Abstract

We consider a service system model primarily motivated by the problem of efficient assignment of 
virtual machines to physical host machines in a network cloud, so that the number of occupied hosts is 
minimized. 

There are multiple input flows of customers of different types, with the mean service time of a customer
depending on its type. There is an infinite number of servers. A server packing configuration is the vector
$k = \{k_i\}$, where $k_i$ is the number of type $i$ customers the server "contains". Packing constraints must be
observed, namely there is a fixed finite set of configurations $k$ that are allowed. Service times of different
customers are independent; after a service completion, each customer leaves its server and the system.
Each newly arriving customer is placed for service immediately; it can be placed into a server already
serving other customers (as long as packing constraints are not violated), or into an idle server.

We consider a simple parsimonious real-time algorithm, called Greedy, which attempts to minimize
the increment of the objective function $\sum_k X_k^{1+\alpha}$, $\alpha > 0$, caused by each new assignment; here $X_k$ is
the number of servers in configuration $k$. (When $\alpha$ is small, $\sum_k X_k^{1+\alpha}$ approximates the total number
$\sum_k X_k$ of occupied servers.) Our main results show that certain versions of the Greedy algorithm are
asymptotically optimal, in the sense of minimizing $\sum_k X_k^{1+\alpha}$ in the stationary regime, as the input flow
rates grow to infinity. We also show that in the special case when the set of allowed configurations
is determined by vector-packing constraints, the Greedy algorithm can work with aggregate configurations
as opposed to exact configurations $k$, thus reducing computational complexity while preserving the
asymptotic optimality.



1 Introduction 



The primary motivation for this work is the following problem arising in cloud computing: how to assign 
various types of virtual machines to physical host machines (in a data center) in real time, so that the total 
number of host machines in use is minimized. It is highly desirable that an assignment algorithm is simple,
does not need to know the system parameters, and makes decisions based on the current system state only. (An
excellent overview of this and other resource allocation issues arising in cloud computing can be found in [3].)

A data center (DC) in the "cloud" consists of a number of host machines. Assume that all hosts are the same:
each of them possesses the amount $B_n > 0$ of resource $n$, where $n \in \{1, 2, \ldots, N\}$ is a resource index. (For
example, resource 1 is CPU, resource 2 is memory, etc.) The DC receives requests for virtual machine (VM)
placements; VMs can be of different types $i \in \{1, \ldots, I\}$; a type $i$ VM requires the amounts $b_{i,n} > 0$ of each
resource $n$. Several VMs can share the same host, as long as the host's capacity constraints are not violated;
namely, a host can simultaneously contain a set of VMs given by a vector $k = (k_1, \ldots, k_I)$, where $k_i$ is the
number of type $i$ VMs, as long as for each resource $n$

$$\sum_i k_i b_{i,n} \le B_n. \qquad (1)$$

Thus, VMs can be assigned to hosts already containing other VMs, subject to the above "packing" constraints.
After a certain random sojourn (service) time each VM vacates its host (leaves the system), which
increases the "room" for newly arriving VMs to be potentially assigned to the host. A natural problem is to
find a real-time algorithm for assigning VM requests to the hosts, which minimizes (in an appropriate sense)
the total number of hosts in use. Clearly, such a scheme will maximize the DC capacity; or, if it leaves a
large number of hosts unoccupied, those hosts can be (at least temporarily) turned off to save energy.
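
To make the packing constraints (1) concrete, the following sketch enumerates all feasible configurations for a
small instance. It is only an illustration: the function name `feasible_configurations`, the restriction to
integer resource amounts, and the numbers in the example are assumptions made here, not values from the paper.

```python
from itertools import product

def feasible_configurations(b, B):
    """Enumerate all vectors k with sum_i k[i] * b[i][n] <= B[n] for every resource n.

    b[i][n] : positive integer amount of resource n needed by one type-i VM (assumed integer)
    B[n]    : positive integer host capacity for resource n
    """
    num_types, num_res = len(b), len(B)
    # Largest number of type-i VMs that fits on an otherwise empty host.
    max_count = [min(B[n] // b[i][n] for n in range(num_res)) for i in range(num_types)]
    configs = []
    for k in product(*(range(m + 1) for m in max_count)):
        if all(sum(k[i] * b[i][n] for i in range(num_types)) <= B[n]
               for n in range(num_res)):
            configs.append(k)
    return configs

# Hypothetical instance: two resources (CPU, memory) and two VM types.
B = [4, 8]              # host capacity per resource
b = [[1, 4], [2, 2]]    # type 0 needs (1 CPU, 4 mem); type 1 needs (2 CPU, 2 mem)
configs = feasible_configurations(b, B)
print(configs)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)]
```

The resulting set contains the empty configuration and is closed under removing a VM, which is the monotonicity
property assumed of the feasible set in the general model.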

More specifically, the model assumptions that we make are as follows: 

(a) The exact nature of "packing" constraints will not be important; we just assume that the feasible
configuration vectors $k$ (describing feasible sets of VMs that can simultaneously occupy one host) form a
finite set $\bar{\mathcal{K}}$, and assume monotonicity: if $k \in \bar{\mathcal{K}}$, then so is any $k' \le k$.

(b) There is no limit on the number of hosts that can be used and each new VM is assigned to a host 
immediately - so it is an infinite server model, with no blocking or waiting. 

(c) Service times of different VMs are independent of each other, even for VMs served simultaneously on the 
same host. 

(d) We further assume in this paper that the arrival processes of VMs of each type are Poisson and service 
time distributions are exponential. These assumptions are not essential and can be substantially relaxed, as
discussed later in the paper.

The basic problem we address in this paper is: 

$$\text{minimize} \quad \sum_k X_k^{1+\alpha}, \qquad (2)$$

where $\alpha > 0$ is a fixed parameter, and $X_k$ is the (random) number of hosts having configuration $k$ in the
stationary regime. (Clearly, when $\alpha$ is small, $\sum_k X_k^{1+\alpha}$ approximates the total number $\sum_k X_k$ of occupied
hosts.) We consider the Greedy real-time (on-line) VM assignment algorithm, which, roughly speaking, tries
to minimize the increment of the objective function $\sum_k X_k^{1+\alpha}$ caused by each new assignment. Our main
results show that certain versions of the Greedy algorithm are asymptotically optimal, as the input flow rates
become large or, equivalently, the average number of VMs in the system becomes large. We also show (in
Section 7) that in the special case when feasible configurations are determined by constraints (1), the Greedy
algorithm can work with "aggregate configurations" as opposed to exact configurations $k$, thus reducing
computational complexity while preserving the asymptotic optimality.

1.1 Previous work 

Our model is related to the vast literature on the classical stochastic bin packing problems. (For a good 
recent review of one-dimensional bin packing see e.g. [2].) In particular, in online stochastic bin packing 
problems, random-size items arrive in the system and need to be placed according to an online algorithm 
into finite size bins; the items never leave or move between bins; the typical objective is to minimize the 
number of occupied bins. A bin packing problem is multi-dimensional, when bins and item sizes are vectors;
the problems with the packing constraints (1) are called multi-dimensional vector packing (see e.g. pQ for
a recent review). Bin packing service systems arise when there is a random in time input flow of random- 
sized items (customers), which need to be served by a bin (server) and leave after a random service time; 
the server can simultaneously process multiple customers as long as they can simultaneously fit into it; the 
customers waiting for service are queued; a typical problem is to determine the maximum throughput under 
a given ("packing") algorithm for assigning customers for service. (See e.g. [3] for a review of this line of 
work.) Our model is similar to the latter systems, except there are multiple bins (servers); in fact, an infinite
number in our case. Models of this type are more recent (see e.g. [51(5]). Paper [5] addresses a real-time
VM allocation problem, which in particular includes packing constraints; the approach of [5] is close in
spirit to Markov Chain algorithms used in combinatorial optimization. Paper [5] is concerned mostly with
maximizing throughput of a queueing system (where VMs can actually wait for service) with a finite number
of bins.

The asymptotic regime in this paper is such that the input flow rates scale up to infinity. In this respect, 
our work is related to the (also vast) literature on queueing systems in the many servers regime. (See e.g. [3] 
for an overview. The name "many servers" reflects the fact that the number of servers scales up to infinity 
as well, linearly with the input rates; this condition is irrelevant in our case of infinite number of servers.) 
In particular, we study fluid limits of our system, obtained by scaling the system state down by the (large) 
total number of customers. We note, however, that packing constraints are not present in the previous work 
on the many servers regime, to the best of our knowledge. 



2 Model and main results 

We consider a service system with $I$ input customer flows of different types, indexed by $i \in \{1, 2, \ldots, I\} \equiv \mathcal{I}$.
Each flow $i$ is Poisson with rate $\Lambda_i > 0$. Service time of a type $i$ customer is an exponentially distributed
random variable with mean $1/\mu_i$. All input flows and customer service times are mutually independent. There
is an infinite number of servers. Each server can potentially serve more than one customer simultaneously,
subject to the following general "packing" constraints. We say that a vector $k = \{k_i, i \in \mathcal{I}\}$ with non-negative
integer components is a server configuration, if a server can simultaneously serve a combination
of different type customers given by vector $k$. The set of all configurations is finite, and is denoted by $\bar{\mathcal{K}}$.
We assume that $k \in \bar{\mathcal{K}}$ implies that all "smaller" configurations $k' \le k$ belong to $\bar{\mathcal{K}}$ as well. Without loss
of generality assume that $e_i \in \bar{\mathcal{K}}$ for all types $i$, where $e_i$ is the $i$-th coordinate unit vector (otherwise,
$i$-customers cannot be served at all). By convention, the (component-wise) zero vector $k = 0$ belongs to $\bar{\mathcal{K}}$;
this is the configuration of an "empty" server. We denote by $\mathcal{K} = \bar{\mathcal{K}} \setminus \{0\}$ the set of configurations not
including the zero configuration.

An important feature of the model is that simultaneous service does not affect the service rates of individual 
customers; in other words, the service time of a customer is unaffected by whether or not there are other 
customers served simultaneously by the same server. Each arriving customer is immediately placed for 
service in one of the servers; it can be "added" to an empty or non-empty server as long as the configuration
feasibility constraint is not violated, i.e. it can be added to any server whose configuration $k \in \bar{\mathcal{K}}$ (before
the addition) is such that $k + e_i \in \bar{\mathcal{K}}$. When the service of an $i$-customer by the server in configuration $k$ is
completed, the customer leaves the system and the server's configuration changes to $k - e_i$. Denote by $X_k$
the number of servers with configuration $k \in \mathcal{K}$. The system state is then the vector $X = \{X_k, k \in \mathcal{K}\}$.

A service discipline ("packing rule") determines where an arriving customer is placed, as a function of the 
current system state X. Under any well-defined service discipline, the system state at time t, X(t), is a 
continuous time, countable Markov chain. It is easily seen to be irreducible and positive recurrent; the 
positive recurrence follows from the fact that the total number $Y_i(t)$ of type $i$ customers in the system is
a process independent of the service discipline, and its stationary distribution is Poisson with mean $\Lambda_i/\mu_i$.
Therefore, the process $X(t)$, $t \ge 0$, has a unique stationary distribution.

We are interested in finding a service discipline minimizing (in a certain sense) the total number of non-empty
servers in the stationary regime. For example, an objective can be

(3)

where $\alpha \ge 0$ is a parameter, and $X(\infty)$ denotes the random system state in the stationary regime. Another
possible objective is

(4)

where $C$ is a fixed threshold. In both problems, setting $\alpha = 0$ obviously corresponds to the exact objective of
minimizing the number of servers in use; however, if a good discipline for $\alpha > 0$ exists, using such a discipline
with positive $\alpha$ close to $0$ would (hopefully) also produce a good solution for the $\alpha = 0$ case.

In this paper we consider objectives (3) and (4) with $\alpha > 0$, and prove that the following Greedy service discipline
(or rather, to be precise, a slight modification of it) is asymptotically optimal as the flows' input rates become
large.

Definition 1 (Greedy discipline). 

1. Integral form (Greedy-I). Define $F(x) = \sum_{k \in \mathcal{K}} (1+\alpha)^{-1} x_k^{1+\alpha}$, $x \in \mathbb{R}_+^{|\mathcal{K}|}$. A type $i$ customer arriving at
time $t$ is added to an available configuration $k$ (with either $k = 0$ or $X_k(t-) > 0$) such that $k + e_i \in \bar{\mathcal{K}}$
and the increment $F(X(t)) - F(X(t-))$ is the smallest. (Here $X(t-)$ and $X(t)$ are the states just before
and just after the addition, so that $X_{k+e_i}(t) = X_{k+e_i}(t-) + 1$ and, unless $k = 0$, $X_k(t) = X_k(t-) - 1$.)
The ties are broken according to an arbitrary deterministic rule.

2. Differential form (Greedy-D). For each $k \in \mathcal{K}$, denote $w_k(x) = (\partial/\partial x_k) F(x) = x_k^\alpha$, $x \in \mathbb{R}_+^{|\mathcal{K}|}$. A type $i$
customer arriving at time $t$ is added to an available configuration $k$ (with either $k = 0$ or $X_k(t-) > 0$)
such that $k + e_i \in \bar{\mathcal{K}}$ and the difference $w_{k+e_i}(X(t-)) - I\{k \ne 0\} w_k(X(t-))$ is the smallest. (Here
$I\{\cdot\}$ is the indicator function, equal to 1 when the condition holds and 0 otherwise.) The ties are broken
according to an arbitrary deterministic rule.
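
The differential form of the rule can be sketched in a few lines of Python. This is a minimal illustration,
assuming that configurations are stored as tuples and the state as a dictionary of server counts; the names
`greedy_d_choice` and `place`, and the tie-breaking rule (the first candidate in sorted order wins), are choices
made here and are not part of the definition.

```python
def greedy_d_choice(X, i, feasible, alpha):
    """Configuration k that an arriving type-i customer joins under Greedy-D.

    X        : dict mapping non-zero configuration tuples to numbers of servers
    feasible : set of all allowed configurations, including the zero tuple
    Returns the chosen k; the zero tuple means "use an empty server".
    """
    def weight(k):
        # w_k(X) = X_k ** alpha; the indicator I{k != 0} removes the empty configuration.
        return X.get(k, 0) ** alpha if any(k) else 0.0

    zero = tuple(0 for _ in next(iter(feasible)))
    # Available configurations: the empty one and every currently occupied one.
    candidates = [zero] + sorted(k for k, n in X.items() if n > 0)
    best_k, best_diff = None, None
    for k in candidates:
        target = tuple(kj + (1 if j == i else 0) for j, kj in enumerate(k))
        if target not in feasible:
            continue                      # adding the customer would violate packing
        diff = weight(target) - weight(k)
        if best_diff is None or diff < best_diff:
            best_k, best_diff = k, diff
    return best_k

def place(X, i, feasible, alpha):
    """Apply the Greedy-D decision and update the state X in place."""
    k = greedy_d_choice(X, i, feasible, alpha)
    target = tuple(kj + (1 if j == i else 0) for j, kj in enumerate(k))
    if any(k):
        X[k] -= 1
    X[target] = X.get(target, 0) + 1
    return target

# Hypothetical instance: one customer type, at most two customers per server.
feasible = {(0,), (1,), (2,)}
print(greedy_d_choice({(1,): 3, (2,): 1}, 0, feasible, 1.0))   # (1,): fill a half-empty server
print(greedy_d_choice({(1,): 1, (2,): 5}, 0, feasible, 1.0))   # (0,): open a new server
```

The second call shows that the rule does not always pack as tightly as possible: when many servers are already
in the full configuration, opening a new server gives the smaller increment of the weights.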

In the asymptotic analysis of this paper, the two forms of the Greedy algorithm are essentially identical, because
they induce the same dynamics of the system in the "fluid limit". We will analyze the differential form, as
it is slightly more convenient to work with, and is probably more easily implementable in practice; it should
be clear that all results (along with essentially the same proofs) hold for the integral form as well.

Remark. All results will hold for a more general objective function $F(x) = \sum_{k \in \mathcal{K}} c_k x_k^{1+\alpha}$ with arbitrary
positive weights $c_k$. The generalization is completely straightforward; we choose to work with $c_k = (1+\alpha)^{-1}$
simply to avoid "carrying" factors $c_k(1+\alpha)$, which clog notation.

We now define the asymptotic regime. Let $r \to \infty$ be a positive scaling parameter. (To be specific, assume
that $r \ge 1$, and $r$ increases to infinity along a discrete sequence.) Input rates scale linearly with $r$; namely,
for each $r$, $\Lambda_i = \lambda_i r$, where $\lambda_i$ are positive parameters. Let $X^r(\cdot)$ be the process associated with the system
with parameter $r$, and $X^r(\infty)$ be the (random) system state in the stationary regime. For each $i$ denote by
$Y_i^r(t) \equiv \sum_{k \in \mathcal{K}} k_i X_k^r(t)$ the total number of customers of type $i$. Since arriving customers are taken for service
immediately and their service times are independent (of the rest of the system), the distribution of $Y_i^r(\infty)$
is Poisson with mean $r\rho_i$, where $\rho_i \equiv \lambda_i/\mu_i$. Moreover, $Y_i^r(\infty)$ are independent across $i$. Since the total
number of occupied servers is no greater than the total number of customers, $\sum_k X_k^r(t) \le Z^r(t) \equiv \sum_i Y_i^r(t)$,
we have a simple upper bound on the total number of occupied servers in steady state, $\sum_k X_k^r(\infty) \le Z^r(\infty)$,
where $Z^r(\infty)$ is a Poisson random variable with mean $r \sum_i \rho_i$. Without loss of generality, from now on in
the paper we assume $\sum_i \rho_i = 1$. (This is equivalent to rechoosing parameter $r$ to be $r \sum_i \rho_i$.)

The fluid scaled process is $x^r(t) = X^r(t)/r$; for any $r$, $x^r(t)$ takes values in (the positive orthant of) Euclidean
space $\mathbb{R}^{|\mathcal{K}|}$, where $|\mathcal{K}|$ is the cardinality of $\mathcal{K}$. Similarly, $y_i^r(t) = Y_i^r(t)/r$ and $z^r(t) = Z^r(t)/r$. Since
$\sum_k x_k^r(\infty) \le z^r(\infty) = Z^r(\infty)/r$, we see that the random variables $(\sum_k x_k^r(\infty))^{1+\alpha}$ are uniformly integrable
in $r$ (for any fixed $\alpha > 0$). This in particular implies that the sequence of distributions of $x^r(\infty)$ is tight,
and therefore there always exists a limit in distribution $x^r(\infty) \Rightarrow x(\infty)$, along a subsequence of $r$. (The
limit depends on the service discipline, of course.) The limit (random) vector $x(\infty)$ satisfies the following
conservation laws:




$$\sum_{k \in \mathcal{K}} k_i x_k(\infty) \equiv \rho_i, \quad \forall i \in \mathcal{I}, \qquad (5)$$

implying, in particular,

(6)

Therefore, the values of $x(\infty)$ are confined to the convex compact $(|\mathcal{K}| - I)$-dimensional polyhedron

$$\mathcal{X} \equiv \Big\{x \in \mathbb{R}^{|\mathcal{K}|} \ \Big|\ x_k \ge 0, \ \forall k \in \mathcal{K}; \ \sum_k k_i x_k = \rho_i, \ \forall i \in \mathcal{I}\Big\}.$$

(We will slightly abuse notation by using symbol $x$ for a generic element of $\mathcal{X}$; while $x(\infty)$, and later $x(t)$,
refer to random variables taking values in $\mathcal{X}$.)

The asymptotic regime and the associated basic properties (5) and (6) hold for any service discipline, not
necessarily Greedy-D.

Note that function $F(x)$ with $\alpha > 0$ is strictly convex on $\mathcal{X}$, and therefore there is a unique optimal point
$x^*$ minimizing $F$:

$$x^* = \arg\min_{u \in \mathcal{X}} F(u). \qquad (7)$$

(Note that, if $\alpha = 0$, then $x^*$ is an optimal solution of a linear program and therefore might not be unique.
We do not consider the $\alpha = 0$ case in this paper.)

Our main results are as follows. 

Let $\alpha > 0$. We prove that, as $r \to \infty$, the convergence in distribution $x^r(\infty) \Rightarrow x^*$ holds in two cases:

(a) For the closed system, with the constant number $Y_i^r = \rho_i r$ of customers of each type, operating under the
Greedy-D discipline. (The exact result is Theorem 2.)

(b) For the original system (as defined above), operating under a slightly modified Greedy-D discipline, called
Greedy-DM. (The exact result is Theorem 12.)

In addition, in the special case when feasible configurations are determined by vector-packing constraints (1),
we prove that essentially the same results hold if the Greedy algorithm uses quantities aggregated over classes of
equivalent configurations, thus reducing the total number of system variables the algorithm needs to maintain.
(The exact results are Theorems 19 and 21.)



2.1 Basic notation and conventions 

Standard Euclidean norm of a vector $x \in \mathbb{R}^n$ is denoted $\|x\|$; the distance from vector $x$ to a set $U$ in
a Euclidean space is denoted $d(x, U) = \inf_{u \in U} \|x - u\|$; $\mathbb{R}_+$ is the set of real non-negative numbers; the
cardinality of a finite set $M$ is $|M|$. Symbol $\to$ means ordinary convergence in $\mathbb{R}^n$; $\Rightarrow$ denotes convergence
in distribution of random variables taking values in $\mathbb{R}^n$, equipped with the Borel $\sigma$-algebra; abbreviation
w.p.1 means convergence with probability 1. We often write $x(\cdot)$ to mean the function (or random process)
$x(t)$, $t \ge 0$. Abbreviation u.o.c. means uniform on compact sets convergence of functions, with the argument
(usually in $[0, \infty)$) determined by the context. We often write $\{x_k\}$ to mean the vector $\{x_k, k \in \mathcal{K}\}$, with
the set of indices $\mathcal{K}$ determined by the context.

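For a finite set of scalar functions $f_n(t)$, $t \ge 0$, $n \in \mathcal{N}$, a point $t$ is called regular if for any subset $\mathcal{N}' \subseteq \mathcal{N}$
the proper derivatives

$$\frac{d}{dt} \max_{n \in \mathcal{N}'} f_n(t) \quad \text{and} \quad \frac{d}{dt} \min_{n \in \mathcal{N}'} f_n(t)$$

exist.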



3 Closed system. Greedy-D optimality 

In this section we consider a "closed" version of our system. Namely, assume that there is a fixed number
$\rho_i r$ of customers of type $i$ (in a system with parameter $r$); there are no exogenous arrivals into the system:
when a service of a type $i$ customer is completed, the customer immediately has to be placed into a server
for a new service. The service discipline determines where the customer is placed, based on the current system
state.






It is easy to see that, for any r, a stationary distribution of the process exists under any given service 
discipline, because the process in this case is a finite-state continuous time Markov chain. 

The main result of this section is the following 

Theorem 2. Consider a sequence of closed systems, indexed by $r$, and let $x^r(\infty)$ denote the random state of
the (fluid-scaled) process in the stationary regime, under the Greedy-D discipline with $\alpha > 0$. Then, as $r \to \infty$,

$$x^r(\infty) \Rightarrow x^*,$$

where $x^*$ is defined in (7).
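
As an informal sanity check of this statement (not a substitute for the proof), one can simulate a tiny closed
system. The sketch below assumes a single customer type with at most two customers per server, so the state is
the pair (X1, X2) of numbers of servers holding one and two customers, with X1 + 2 X2 = r. For this instance and
$\alpha = 1$, minimizing $(x_1^2 + x_2^2)/2$ subject to $x_1 + 2 x_2 = 1$ gives $x^* = (1/5, 2/5)$. The function name,
the seed, the run length and the tie-breaking rule are arbitrary choices made here.

```python
import random

def simulate_closed(r, alpha, events, seed=0):
    """Jump chain of the closed system: one customer type, at most 2 customers per server.

    Since all r customers have the same service rate, the next customer to complete
    service is uniformly distributed among them. Returns the fluid-scaled state.
    """
    rng = random.Random(seed)
    X1, X2 = r, 0                      # start with every customer alone on a server
    for _ in range(events):
        # Departure of a uniformly chosen customer.
        if rng.random() < X1 / r:
            X1 -= 1                    # a server with 1 customer becomes empty
        else:
            X2 -= 1                    # a server with 2 customers drops to 1
            X1 += 1
        # Greedy-D placement of the same customer, using the state after the departure.
        open_new = X1 ** alpha               # w_1(X), empty server has no weight
        join = X2 ** alpha - X1 ** alpha     # w_2(X) - w_1(X)
        if X1 > 0 and join < open_new:
            X1 -= 1
            X2 += 1
        else:                                # ties are resolved by opening a new server
            X1 += 1
    return X1 / r, X2 / r

x1, x2 = simulate_closed(r=1000, alpha=1.0, events=50000)
print(x1, x2)   # expected to be close to (0.2, 0.4)
```

Only the embedded jump chain is simulated, which is enough to look at the state after many events; reproducing
the time scale of the process would additionally require exponential holding times.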

To prove the theorem we will need to study the transient behavior of the fluid-scaled process and its limits. 

Let $\mathcal{M}$ denote the set of pairs $(k, i)$ such that $k \in \mathcal{K}$ and $k - e_i \in \bar{\mathcal{K}}$. Each pair $(k, i)$ is associated with the
"edge" $(k - e_i, k)$ connecting configurations $k - e_i$ and $k$; often we refer to this edge as $(k, i)$. By "arrival
along the edge $(k, i)$" we will mean placement of a type $i$ customer into a server in configuration $k - e_i$ to form
configuration $k$; similarly, "departure along the edge $(k, i)$" is a departure of a type $i$ customer from a server
in configuration $k$, which changes its configuration to $k - e_i$.

For each $(k, i) \in \mathcal{M}$, consider an independent unit-rate Poisson process $\Pi_{ki}(t)$, $t \ge 0$. We have the functional
strong law of large numbers:

$$\frac{1}{r} \Pi_{ki}(rt) \to t, \quad \text{u.o.c., w.p.1.} \qquad (8)$$

Without loss of generality, assume that the Markov process $X^r(\cdot)$ for each $r$ is driven by the common set
of Poisson processes $\Pi_{ki}(\cdot)$, as follows. For each $(k, i) \in \mathcal{M}$, let us denote by $D_{ki}^r(t)$ the total number of
departures along the edge $(k, i)$ in $[0, t]$; then we can assume that

$$D_{ki}^r(t) = \Pi_{ki}\Big(\int_0^t X_k^r(s) k_i \mu_i \, ds\Big). \qquad (9)$$

Each type-$i$ departure in the closed system is simultaneously a type-$i$ "arrival", which is allocated according
to Greedy-D. Thus, the realization of the process is (w.p.1) uniquely determined by the initial state $X^r(0)$
and the realizations of $\Pi_{ki}(\cdot)$. Denote by $A_{ki}^r(t)$ the total number of arrivals allocated along edge $(k, i)$.
Obviously, we have the conservation law for each type $i$: $\sum_k A_{ki}^r(t) = \sum_k D_{ki}^r(t)$, $t \ge 0$. In addition to

$$x_k^r(t) = \frac{1}{r} X_k^r(t),$$

we introduce other fluid-scaled quantities:

$$d_{ki}^r(t) = \frac{1}{r} D_{ki}^r(t), \qquad a_{ki}^r(t) = \frac{1}{r} A_{ki}^r(t).$$

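A set of Lipschitz continuous functions $[\{x_k(\cdot), k \in \mathcal{K}\}, \{d_{ki}(\cdot), (k, i) \in \mathcal{M}\}, \{a_{ki}(\cdot), (k, i) \in \mathcal{M}\}]$ on the
time interval $[0, \infty)$ we call a fluid sample path (FSP), if there exist realizations of $\Pi_{ki}(\cdot)$ satisfying (8) and
a fixed subsequence of $r$, along which

$$[\{x_k^r(\cdot), k \in \mathcal{K}\}, \{d_{ki}^r(\cdot), (k, i) \in \mathcal{M}\}, \{a_{ki}^r(\cdot), (k, i) \in \mathcal{M}\}] \to
[\{x_k(\cdot), k \in \mathcal{K}\}, \{d_{ki}(\cdot), (k, i) \in \mathcal{M}\}, \{a_{ki}(\cdot), (k, i) \in \mathcal{M}\}], \quad \text{u.o.c.} \qquad (10)$$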

It is easy to see that the family of all FSPs is uniformly Lipschitz. 

Lemma 3. Suppose the initial states $x^r(0)$ are fixed and are such that $x^r(0) \to x(0)$. Then, w.p.1 for any
subsequence of $r$ there exists a further subsequence of $r$, along which the convergence (10) holds, where the
limit is an FSP.

Proof is very standard. Essentially, it suffices to observe that, with probability 1, each sequence (in $r$) of
functions $d_{ki}^r(\cdot)$ is asymptotically Lipschitz, namely for some $C > 0$, and all $0 \le t_1 \le t_2 < \infty$,

$$\limsup_r \, [d_{ki}^r(t_2) - d_{ki}^r(t_1)] \le C(t_2 - t_1),$$

which in turn follows from (8). And similarly for functions $a_{ki}^r(\cdot)$, because $a_{ki}^r(t) \le \sum_k d_{ki}^r(t)$. Using this and
(10), we easily verify the u.o.c. convergence of all $x_k^r(\cdot)$, $d_{ki}^r(\cdot)$, $a_{ki}^r(\cdot)$, along possibly a further subsequence,
and the fact that the limits are Lipschitz. We omit details. □

For an FSP, at a regular time point $t$, we denote $v_{ki}(t) = (d/dt) a_{ki}(t)$ and $w_{ki}(t) = (d/dt) d_{ki}(t)$. In other
words, $v_{ki}(t)$ and $w_{ki}(t)$ are the rates of type $i$ "fluid" arrival and departure along edge $(k, i)$, respectively.

Lemma 4. An FSP satisfies the following properties: 

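$$y_i(t) \equiv \sum_k k_i x_k(t) \equiv \rho_i, \quad \forall i \in \mathcal{I};$$

at any regular point $t$,

$$w_{ki}(t) = k_i \mu_i x_k(t), \quad \forall (k, i) \in \mathcal{M},$$

$$\sum_{k : (k,i) \in \mathcal{M}} w_{ki}(t) = \sum_{k : (k,i) \in \mathcal{M}} v_{ki}(t) = \lambda_i, \quad \forall i \in \mathcal{I},$$

$$(d/dt) x_k(t) = \Big[\sum_{i : k - e_i \in \bar{\mathcal{K}}} v_{ki}(t) - \sum_{i : k + e_i \in \bar{\mathcal{K}}} v_{k+e_i, i}(t)\Big]
 - \Big[\sum_{i : k - e_i \in \bar{\mathcal{K}}} w_{ki}(t) - \sum_{i : k + e_i \in \bar{\mathcal{K}}} w_{k+e_i, i}(t)\Big], \quad \forall k \in \mathcal{K}.$$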

Proof is both standard and obvious. □ 
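
The last identity of the lemma is easy to evaluate numerically. The sketch below computes the right-hand side
of the equation for $(d/dt) x_k$ from given arrival rates along the edges and the current fluid state. The function
name `fluid_rhs`, the dictionary-based data layout and the example values are assumptions made for illustration.

```python
def fluid_rhs(x, v, mu, feasible):
    """Right-hand side of (d/dt) x_k for every non-zero configuration k.

    x        : dict k -> x_k, containing every non-zero feasible configuration
    v        : dict (k, i) -> fluid arrival rate along edge (k, i); missing edges count as 0
    mu       : list of service rates, one per customer type
    feasible : set of allowed configurations, including the zero tuple
    """
    def up(k, i):
        return tuple(kj + (1 if j == i else 0) for j, kj in enumerate(k))

    def w(k, i):
        # Fluid departure rate along edge (k, i): k_i * mu_i * x_k.
        return k[i] * mu[i] * x[k]

    dx = {}
    for k in x:
        total = 0.0
        for i in range(len(mu)):
            if k[i] >= 1:                  # edge (k, i): arrivals create k, departures destroy it
                total += v.get((k, i), 0.0) - w(k, i)
            if up(k, i) in feasible:       # edge (k + e_i, i): arrivals destroy k, departures create it
                total -= v.get((up(k, i), i), 0.0) - w(up(k, i), i)
        dx[k] = total
    return dx

# Hypothetical instance: one type, at most two customers per server, mu = 1, rho = 1.
feasible = {(0,), (1,), (2,)}
x = {(1,): 0.5, (2,): 0.25}               # satisfies x_1 + 2 * x_2 = 1
v = {((2,), 0): 1.0}                      # all arrivals join servers holding one customer
dx = fluid_rhs(x, v, [1.0], feasible)
print(dx)                                 # {(1,): -1.0, (2,): 0.5}
```

In this example $dx_1/dt + 2\, dx_2/dt = 0$, in agreement with the first property of the lemma: the total amount
of type-$i$ fluid stays equal to $\rho_i$.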



4 Characterization of the optimal point x* and related properties 

This section describes properties of the optimal point $x^*$, and related general properties of the "allocation"
vectors, which, roughly speaking, have the meaning of the vector $v(t) = \{v_{ki}(t), (k, i) \in \mathcal{M}\}$ of arrival rates.
The results of this section are not limited to the closed system or the Greedy-D algorithm.

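Recall that $x^*$ is defined as the unique optimal solution of the convex optimization problem

$$\min_{x \in \mathbb{R}_+^{|\mathcal{K}|}} F(x) \qquad (11)$$

subject to

$$\sum_{k \in \mathcal{K}} k_i x_k = \rho_i, \quad \forall i, \qquad (12)$$

where $F(x) = \sum_k (1+\alpha)^{-1} x_k^{1+\alpha}$ with $\alpha > 0$.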

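Using Lagrange multipliers $\eta_i$ for the constraints (12), the Lagrangian is

$$\sum_k \frac{1}{1+\alpha} x_k^{1+\alpha} + \sum_i \eta_i \Big(\rho_i - \sum_k k_i x_k\Big)
 = \sum_k \Big[\frac{1}{1+\alpha} x_k^{1+\alpha} - x_k \sum_i k_i \eta_i\Big] + \sum_i \eta_i \rho_i.$$

Therefore, we have the following characterization: a vector $x \in \mathcal{X}$ satisfies $x = x^*$ if and only if there exist
constants $\eta_i$ such that

$$x_k^\alpha = \max\Big\{\sum_i k_i \eta_i, 0\Big\}, \quad \forall k \in \mathcal{K}. \qquad (13)$$

From here we also observe that at least one of the Lagrange multipliers is strictly positive, $\eta_i > 0$, and
therefore, for such $i$, $x^*_{e_i} > 0$.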
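
As a worked illustration of characterization (13), consider a toy instance that is not taken from the paper:
a single customer type, at most two customers per server, and $\rho_1 = 1$. Under these assumptions the
optimal point can be written in closed form.

```latex
% Assumed toy instance: one customer type, \mathcal{K} = \{1, 2\}, \rho_1 = 1.
\begin{align*}
  &\text{Constraint (12):} && x_1 + 2 x_2 = 1, \\
  &\text{Condition (13):}  && x_1^{\alpha} = \eta, \quad x_2^{\alpha} = 2\eta, \quad \eta > 0
     \;\Longrightarrow\; x_2 = 2^{1/\alpha} x_1, \\
  &\text{Solution:}        && x_1^* = \frac{1}{1 + 2^{1 + 1/\alpha}}, \qquad
                              x_2^* = \frac{2^{1/\alpha}}{1 + 2^{1 + 1/\alpha}}, \\
  &\alpha = 1:             && x^* = (1/5,\ 2/5), \qquad x_1^* + x_2^* = 3/5, \\
  &\alpha \downarrow 0:    && x^* \to (0,\ 1/2), \qquad x_1^* + x_2^* \to 1/2 .
\end{align*}
```

In the limit $\alpha \downarrow 0$ the solution approaches the linear-programming optimum, in which every occupied
server holds two customers; for positive $\alpha$ the optimal point keeps a positive fraction of servers with a
single customer.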

For $x \in \mathcal{X}$ and $(k, i) \in \mathcal{M}$ denote

$$\Delta_{ki} = \Delta_{ki}(x) = x_k^\alpha - x_{k-e_i}^\alpha,$$

where, by convention, $x_k^\alpha = 0$ when $k = 0$.

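Definition 5 (Allocations). A vector $\gamma = \{\gamma_{ki}, (k, i) \in \mathcal{M}\}$ such that all $\gamma_{ki} \ge 0$ and

$$\sum_{k : (k,i) \in \mathcal{M}} \gamma_{ki} = \lambda_i, \quad \forall i,$$

we will call an allocation (of the arrival rates). For a given $x \in \mathcal{X}$, the allocation $\gamma' = \gamma'(x)$, with components
$\gamma'_{ki} = k_i \mu_i x_k$, is called neutral. An allocation $\gamma$ is called a simple improving allocation, or SI-allocation,
for a given $x \in \mathcal{X}$, if there exist edges $(k, i)$ and $(k', i)$, with the same $i$ but $k' \ne k$, such that the following
conditions hold:

$$k_i > 0, \quad x_k > 0,$$
$$\text{either } k' = e_i \text{ or } [k'_i > 0 \text{ and } x_{k' - e_i} > 0], \qquad (14)$$
$$\Delta_{k'i} < \Delta_{ki}, \qquad (15)$$
$$\gamma_{ki} = 0, \quad \gamma_{k'i} = k'_i \mu_i x_{k'} + k_i \mu_i x_k,$$
$$\gamma_{\ell j} = \gamma'_{\ell j} = \ell_j \mu_j x_\ell, \quad \text{for } (\ell, j) \ne (k, i), (k', i).$$

We denote by $\Gamma(x)$ the set consisting of the neutral and all SI-allocations for $x$.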

The meaning of this definition is simple and becomes clear with the use of the following notation. For $x \in \mathcal{X}$,
denote

$$D(\gamma) = D(\gamma, x) = \sum_{(k,i) \in \mathcal{M}} \Delta_{ki} [\gamma_{ki} - k_i \mu_i x_k].$$

(The meaning of this is: $D(v(t), x(t)) = (d/dt) F(x(t))$ for an FSP at a regular point $t$.) Clearly, $D(\gamma') = 0$
for the neutral allocation, and $D(\gamma) < 0$ for any SI-allocation. This is because any SI-allocation $\gamma$, associated
with edges $(k, i)$ and $(k', i)$, is obtained from the neutral one by "reallocating" the positive amount $k_i \mu_i x_k$ of
type-$i$ input rate from edge $(k, i)$ to edge $(k', i)$ with strictly smaller $\Delta_{k'i}$; condition (14) guarantees that
there are servers in the configuration $k' - e_i$, to which these reallocated type-$i$ arrivals can be added.
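
The quantities in this definition can be checked on a small instance. The sketch below evaluates the drift of
$F$ for the neutral allocation and for one SI-allocation; the helper names (`delta`, `neutral_allocation`,
`drift`) and the example state are assumptions made here for illustration, with one customer type and at most
two customers per server.

```python
def delta(x, k, i, alpha):
    """Delta_ki(x) = x_k^alpha - x_{k - e_i}^alpha, with the convention that k = 0 contributes 0."""
    prev = tuple(kj - (1 if j == i else 0) for j, kj in enumerate(k))
    prev_term = x.get(prev, 0.0) ** alpha if any(prev) else 0.0
    return x.get(k, 0.0) ** alpha - prev_term

def neutral_allocation(x, mu):
    """gamma'_ki = k_i * mu_i * x_k on every edge (k, i), i.e. every pair with k_i >= 1."""
    return {(k, i): k[i] * mu[i] * xk
            for k, xk in x.items() for i in range(len(mu)) if k[i] >= 1}

def drift(gamma, x, mu, alpha):
    """Sum over edges of Delta_ki * (gamma_ki - k_i * mu_i * x_k)."""
    return sum(delta(x, k, i, alpha) * (g - k[i] * mu[i] * x[k])
               for (k, i), g in gamma.items())

# Hypothetical state with x_1 + 2 * x_2 = 1, mu = 1, alpha = 1.
x = {(1,): 0.5, (2,): 0.25}
mu = [1.0]
neutral = neutral_allocation(x, mu)

# SI-allocation: move the rate of edge ((1,), 0) to edge ((2,), 0).
# Conditions (14)-(15) hold: x_{(2,) - e_0} = x_1 > 0 and Delta = -0.25 < 0.5.
si = dict(neutral)
si[((2,), 0)] += si[((1,), 0)]
si[((1,), 0)] = 0.0

print(drift(neutral, x, mu, 1.0))   # 0.0 for the neutral allocation
print(drift(si, x, mu, 1.0))        # -0.375, strictly negative
```

The reallocation sends all arrivals to servers that already hold one customer, which is consistent with the
state having too many singly occupied servers relative to the optimal point for this instance.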

Lemma 6. If $x \in \mathcal{X}$ and $x \ne x^*$, then there exists at least one SI-allocation $\gamma \in \Gamma(x)$.

Proof is by contradiction. Suppose SI-allocations do not exist. Then there must exist at least one $i \in \mathcal{I}$ such
that $x_{e_i} > 0$. If not, we pick a minimal $k$ with $x_k > 0$, for which necessarily $\sum_i k_i \ge 2$. Then we could pick $i$
such that $k_i \ge 1$ and construct the SI-allocation $\gamma$ associated with edges $(k, i)$ and $(e_i, i)$. (I.e., $\gamma$ is obtained
from the neutral allocation by "reallocating" amount $k_i \mu_i x_k$ of type $i$ input from edge $(k, i)$ to edge $(e_i, i)$.)
Therefore, indeed, $x_{e_i} > 0$ for at least one $i$.

Denote by $\mathcal{I}^+$ the set of those $i$ with $x_{e_i} > 0$, and by $\mathcal{I}^- = \mathcal{I} \setminus \mathcal{I}^+$ the set of those $i$ with $x_{e_i} = 0$. Set $\mathcal{I}^+$ is
non-empty, while $\mathcal{I}^-$ may or may not be empty. Let us fix the following values of $\eta_i$:

$$\eta_i = \begin{cases} x_{e_i}^\alpha, & \text{if } i \in \mathcal{I}^+, \\ \min_{k : (k,i) \in \mathcal{M}} \Delta_{ki}, & \text{if } i \in \mathcal{I}^-. \end{cases}$$

It follows from this definition that $\eta_i > 0$ if and only if $i \in \mathcal{I}^+$.

Let us show that (13) must hold for these $\eta_i$. Denote $u_k = \max\{\sum_i k_i \eta_i, 0\}$; so, we will show that $x_k^\alpha = u_k$ for
all $k$. Suppose not. Let us choose a minimal $k$ for which this relation fails. Note that necessarily $\sum_i k_i \ge 2$.

Consider first the case when $k_i = 0$ for all $i \in \mathcal{I}^-$. Pick any $i \in \mathcal{I}^+$ for which $k_i \ge 1$. If $x_k^\alpha > u_k$, then
$\Delta_{ki} > \eta_i$; we can then construct the SI-allocation associated with the reallocation from $(k, i)$ to $(e_i, i)$; this
is a contradiction. If $x_k^\alpha < u_k$, then $\Delta_{ki} < \eta_i$, and we can construct the SI-allocation which does the
"opposite" reallocation from $(e_i, i)$ to $(k, i)$; again, a contradiction.

The second case is when $k_i \ge 1$ for at least one $i \in \mathcal{I}^-$. We pick such an $i$. Condition $x_k^\alpha < u_k$ would imply
$\Delta_{ki} < \eta_i$, a contradiction with the definition of $\eta_i$. Therefore, $x_k^\alpha > u_k$, and then $x_k > 0$ and $\Delta_{ki} > \eta_i$.
If $\eta_i = 0$, then we can construct the SI-allocation associated with the reallocation from $(k, i)$ to $(e_i, i)$.
Otherwise, if $\eta_i < 0$, there exists an edge $(k', i)$ such that $\Delta_{k'i} = \eta_i < 0$ and, consequently, $x_{k' - e_i} > 0$; we
can construct the SI-allocation associated with the reallocation from $(k, i)$ to $(k', i)$.

We have proved that, indeed, (13) holds for the chosen values of $\eta_i$. But this means that $x = x^*$; this
contradiction completes the proof. □

For any $x$, denote

$$D_{\min}(x) = \min_{\gamma \in \Gamma(x)} D(\gamma, x).$$

The minimum is attained and, obviously, $D_{\min}(x) \le 0$ for any $x$.

Lemma 7. For any $\epsilon > 0$ there exists $\epsilon_1 > 0$ such that

$$\|x - x^*\| \ge \epsilon \quad \text{implies} \quad D_{\min}(x) \le -\epsilon_1. \qquad (16)$$

Proof. We use compactness of $\mathcal{X}$. If the lemma statement does not hold, then, for some fixed $\epsilon > 0$ we
can find a converging sequence $x^{(n)} \to x' \in \mathcal{X}$, $n \to \infty$, such that $D_{\min}(x^{(n)}) \to 0$ and $\|x' - x^*\| \ge \epsilon$. By
Lemma 6 there exists an SI-allocation $\gamma \in \Gamma(x')$. Since $x'_k > 0$ implies $x^{(n)}_k > 0$ for all large $n$, we can
construct (for all large $n$) an SI-allocation $\gamma^{(n)} \in \Gamma(x^{(n)})$ associated with the same reallocation as the one
producing $\gamma$. Then, we have $\gamma^{(n)} \to \gamma$ and $D(\gamma^{(n)}, x^{(n)}) \to D(\gamma, x') < 0$. This contradicts the assumption
$D_{\min}(x^{(n)}) \to 0$. □



5 Proof of Theorem 2

Lemma 8. Any FSP (for the closed system under Greedy-D) is such that at any regular point $t$,

$$\frac{d}{dt} F(x(t)) = D(v(t), x(t)) \le D_{\min}(x(t)). \qquad (17)$$

Proof. Within this proof, $x = x(t)$ and $v = v(t)$. Consider a specific allocation $\gamma \in \Gamma(x)$, for which the
minimum in the definition of $D_{\min}(x)$ is attained; unless $x = x^*$, $\gamma$ is an SI-allocation, associated with some
fixed reallocation from $(k, i)$ to $(k', i)$. Define the following "distribution function":

Function $H(\xi; \gamma)$ is defined the same way, by replacing $v$ with $\gamma$. Then, we must have

$$H(\xi; v) \ge H(\xi; \gamma), \quad \forall \xi. \qquad (18)$$

Indeed, consider any two edges $(\ell, j) \ne (\ell', j)$ with a common $j$, and suppose $\Delta_{\ell' j} < \Delta_{\ell j}$. Then, in a fixed
sufficiently small time interval $[t, t + \epsilon]$, for all pre-limit trajectories with sufficiently large parameter $r$, any
$j$-customer whose service completes at a server with configuration $\ell'$ cannot possibly be placed along the
edge $(\ell, j)$. (Here we use the fact that the system is closed: when a customer service is completed, there is
always an option to place it back into the same server for the new service.) Furthermore, if either $\ell' = e_j$ or
$x_{\ell' - e_j} > 0$, any $j$-customer completing service in configuration $\ell$ will be placed for new service either along
the edge $(\ell', j)$ or possibly along another edge $(\ell'', j)$ with $\Delta_{\ell'' j} \le \Delta_{\ell' j}$. This proves (18), from which (17)
easily follows. □

As a corollary of Lemma 8 we obtain

Lemma 9. Any FSP (for the closed system under Greedy-D) is such that

$$x(t) \to x^*. \qquad (19)$$

The convergence is uniform across all initial states $x(0) \in \mathcal{X}$.

Remainder of the proof of Theorem 2. We fix $\epsilon > 0$ and choose $T > 0$ large enough so that, for any FSP, we
have $\|x(T) - x^*\| \le \epsilon$. We claim that, as $r \to \infty$,

$$P\{\|x^r(T) - x^*\| > 2\epsilon\} \to 0, \qquad (20)$$

where the convergence is uniform across all initial states $x^r(0)$. This claim is true, because for an arbitrary
sequence of fixed initial states $x^r(0)$, we must have

$$\limsup_r \|x^r(T) - x^*\| \le \epsilon, \quad \text{w.p.1},$$

because w.p.1 we can always choose a subsequence of $r$ along which the u.o.c. convergence to an FSP holds.
Claim (20) implies the result, because $\epsilon$ can be arbitrarily small. □



6 Original system. Optimality of a modified version of Greedy-D 

We now return to our original "open" system. Unfortunately, the proof of Greedy-D optimality in the closed
system does not carry over to the open system. The key reason can be informally described as follows. If
$v(t)$ is the exogenous arrival rate allocation vector in the fluid limit, then the result analogous to Lemma 8
no longer holds; namely, property (18) in its proof is not valid. Suppose we have an edge $(\ell, j)$ on which the
unique minimum $\min_{\ell'} \Delta_{\ell' j}$ is attained, $x_{\ell - e_j} = 0$, $x_\ell > 0$ and $\ell_j > 0$. There is the non-zero rate $\mu_j \ell_j x_\ell$ of
departures along $(\ell, j)$. In this case it is possible that $v_{\ell j} < \mu_j \ell_j x_\ell$, because some type $j$ exogenous arrivals
will find no servers in configuration $\ell - e_j$. (In the closed system we must have $v_{\ell j} \ge \mu_j \ell_j x_\ell$, because all
departures along $(\ell, j)$ have the option of "coming back" along the same edge.) Therefore, the argument
leading to (18), and in fact the property itself, does not hold.

In this section we will prove the asymptotic optimality of a slightly modified version of the Greedy-D algorithm,
called Greedy-DM, which, in a sense, "emulates" the behavior of Greedy-D in the corresponding
closed system. Informally, the key idea of Greedy-DM is to make decisions about placements of new exogenous
type $i$ arrivals a little "in advance", at the instants of type $i$ departures. This is achieved by using
"placeholders," called tokens: when a type $i$ departure occurs, we "pretend" that immediately a new type $i$
arrival occurs, decide which server this arrival would be placed into according to the Greedy-D rule, and place
a type $i$ token in that server. When new actual type $i$ customers arrive, they first seek and replace type $i$
tokens if there are any; if no type $i$ token is available, the customer is placed according to Greedy-D. (In
addition to being replaced by actual customers, the tokens are made "impatient": they leave the system by
themselves after a random exponentially distributed time.) The analysis in this section shows that, in the
fluid limit, "all" tokens are replaced by actual arrivals and "all" actual arrivals replace tokens. This means
that the allocation of actual customer arrival rates is "equal" to that of tokens; the latter, in turn, satisfies
the same properties as the rate allocation in the closed system.

Definition 10 (Greedy-DM discipline). Suppose the weights Wk(x) — x^ are given, as in the definition 
of standard Greedy-D. At any given time there are two kinds of type i customers - actual customers being 
served as usual and tokens. For the purposes of defining server configurations k and the system state X , 
type i tokens are treated the same way as actual type i customers. 

When a departure of actual type i customer from configuration k at time t occurs, the following actions are 
taken: a new token of type i is immediately created; this new token is treated the same way as a new type 
i arrival, and is placed for "service " immediately; the token is added to an available configuration k ( with 
either k = or X k (t-) > 0) such that k + e» G K and the difference W k+et (X(t-)) - I{k ^ 0}W k (X(t-)) 



F{\\x r (T) 



x*\\ > 2e} -> 0, 



(20) 



limsup||a; r (r) -x*\\ < e, w.p.l, 
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is the smallest (where t— refers to the time after the service completion, but before the token placement). 
When a new exogenous arrival of an (actual) type i customer occurs, it replaces an arbitrarily chosen type i 
token (anywhere in the system), if such is available; otherwise, this arrival is added (as in the usual Greedy- 
D) to an available configuration k (with either k = or Xk(t—) > 0) such that k + e, G K, and the difference 
Wk+a {X{t— )) — I{k 7^ 0}Wk(X(t— )) is the smallest (where t— refers to the time just before the arrival). 
Any token of any type anywhere in the system "completes service" at the rate fio > 0, independently of 
anything else (i.e. the probability of service completion in a dt-long interval is p$dt + o(dt)); "service 
completion" of any token is treated the same way as service completion of an actual customer, except no new 
token is created. 
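The token mechanics of Definition 10 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not part of the paper: configurations are Python tuples, the state is a `Counter` keyed by complete configurations (total, actual), the value of `ALPHA` is arbitrary, ties are broken by iteration order, and the choice of a server among those with the same total configuration is arbitrary. Token impatience (rate $\mu_0$) is omitted.

```python
from collections import Counter

ALPHA = 0.5  # assumed value of the parameter alpha > 0

def unit(i, n):
    return tuple(1 if j == i else 0 for j in range(n))

def add(k, e):
    return tuple(a + b for a, b in zip(k, e))

def sub(k, e):
    return tuple(a - b for a, b in zip(k, e))

def projection(state):
    """X_k: number of servers whose total (actual + token) configuration is k."""
    X = Counter()
    for (k, _khat), cnt in state.items():
        X[k] += cnt
    return X

def greedy_d_target(X, allowed, i, n):
    """Configuration k minimising w_{k+e_i}(X) - I{k != 0} w_k(X), with w_k(x) = x_k**ALPHA."""
    zero, e = (0,) * n, unit(i, n)
    best, best_diff = None, None
    for k in [zero] + [k for k, cnt in X.items() if cnt > 0]:
        if add(k, e) not in allowed:
            continue
        diff = X[add(k, e)] ** ALPHA - (X[k] ** ALPHA if k != zero else 0.0)
        if best is None or diff < best_diff:
            best, best_diff = k, diff
    return best

def shift_server(state, src, dst, n):
    """Move one server from complete configuration src to dst; empty servers are not stored."""
    zero = (0,) * n
    if src[0] != zero:
        state[src] -= 1
        if state[src] == 0:
            del state[src]
    if dst[0] != zero:
        state[dst] += 1

def place(state, allowed, i, n, actual):
    """Greedy-D placement of a type-i token (actual=False) or actual customer (actual=True)."""
    zero, e = (0,) * n, unit(i, n)
    k = greedy_d_target(projection(state), allowed, i, n)  # assumes e_i itself is allowed
    src = (zero, zero) if k == zero else next(c for c in state if c[0] == k)
    dst = (add(k, e), add(src[1], e) if actual else src[1])
    shift_server(state, src, dst, n)

def actual_departure(state, allowed, conf, i, n):
    """An actual type-i customer leaves a server in complete configuration conf; a token is placed."""
    k, khat = conf
    e = unit(i, n)
    shift_server(state, conf, (sub(k, e), sub(khat, e)), n)
    place(state, allowed, i, n, actual=False)

def actual_arrival(state, allowed, i, n):
    """An actual type-i customer replaces a type-i token if one exists, else is placed greedily."""
    e = unit(i, n)
    for (k, khat) in list(state):
        if k[i] > khat[i]:  # this server holds at least one type-i token
            shift_server(state, (k, khat), (k, add(khat, e)), n)
            return
    place(state, allowed, i, n, actual=True)
```

In this sketch a departure immediately followed by an arrival of the same type returns the system to a state with no tokens, which is the sense in which Greedy-DM "emulates" the closed system.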

The random process describing the evolution of this system is more complex, but it is still an irreducible
Markov chain. A complete server configuration is by definition a pair $(k, \hat k)$, where vector $k = (k_1, \ldots, k_I) \in \mathcal{K}$
gives the numbers of all customers (both actual and tokens) in a server, while vector $\hat k \le k$, $\hat k \in \bar{\mathcal{K}}$, gives the
numbers of actual customers only. Therefore, the Markov process state at time $t$ is the vector $\{X_{(k,\hat k)}(t)\}$,
where the index $(k, \hat k)$ takes values that are all possible complete server configurations, as described above.
Obviously, $\hat Y_i^r(t) \le Y_i^r(t)$ for all $i$ and $t$, where $Y_i^r(t)$ and $\hat Y_i^r(t)$ are the total numbers of all and actual type $i$
customers, respectively, and superscript $r$, as usual, indicates the system with parameter $r$. Moreover, the
behaviors of the processes $(Y_i^r(t), \hat Y_i^r(t))$, $t \ge 0$, are independent across all $i$, with $\hat Y_i^r(\infty)$ having Poisson
distribution with mean $\rho_i r$. Finally, by the Greedy-DM definition, at any time any existing type $i$ token
"completes service" at the rate $\mu_0$. Using these facts, we easily establish the following

Lemma 11. Markov chain $\{X^r_{(k,\hat k)}(t)\}$, $t \ge 0$, is positive recurrent for each $r$. Moreover, the distributions
of $\{(y_i^r(\infty), \hat y_i^r(\infty)),\ i \in \mathcal{I}\} = (1/r)\{(Y_i^r(\infty), \hat Y_i^r(\infty)),\ i \in \mathcal{I}\}$ are tight, and any limit in distribution
$\{(y_i(\infty), \hat y_i(\infty)),\ i \in \mathcal{I}\}$ is such that $\hat y_i(\infty) = \rho_i$ and $y_i(\infty) \le \rho_i + \lambda_i/\mu_0$ for all $i$. Consequently, the
distributions of $\{x^r_{(k,\hat k)}(\infty)\} = (1/r)\{X^r_{(k,\hat k)}(\infty)\}$ are tight.

Proof. Consider the process $(Y_i^r(\cdot), \hat Y_i^r(\cdot))$ for a fixed $r$ and $i$. Consider a different process $(\tilde Y_i^r(\cdot), \hat Y_i^r(\cdot))$,
defined as follows: actual type $i$ customers arrive the same way, and depart after their service completion,
so that $\hat Y_i^r(t)$ is the same as before; upon every service completion of an actual $i$-customer, one type $i$ token
is created, which then stays in the system for a random time, exponentially distributed with mean $1/\mu_0$,
independently of anything else, and then leaves the system; $\tilde Y_i^r(t)$ is the total number of all type $i$ customers
(actual and tokens) in the system. Obviously, processes $(Y_i^r(\cdot), \hat Y_i^r(\cdot))$ and $(\tilde Y_i^r(\cdot), \hat Y_i^r(\cdot))$ can be constructed
on a common probability space so that $Y_i^r(t) \le \tilde Y_i^r(t)$. But $\tilde Y_i^r(t)$ is simply the number of customers in the
infinite-server system with input rate $\lambda_i r$ and mean service time $1/\mu_i + 1/\mu_0$. Therefore, $(Y_i^r(\cdot), \hat Y_i^r(\cdot))$ is
positive recurrent; moreover, in stationary regime, $\tilde Y_i^r(\infty)$ and $\hat Y_i^r(\infty)$ have Poisson distributions with means
$\lambda_i r(1/\mu_i + 1/\mu_0)$ and $\rho_i r$, respectively. The lemma results easily follow. □

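Let $\{x_{(k,\hat k)}\}$ denote a vector with non-negative components, with indices $(k, \hat k)$ being all possible complete
configurations. Denote by $x = \{x_k\}$, $y = \{y_i\}$ and $\hat y = \{\hat y_i\}$ its projections, with components being

$$x_k = \sum_{\hat k:\ \hat k \le k} x_{(k,\hat k)}, \qquad y_i = \sum_{(k,\hat k)} k_i x_{(k,\hat k)} = \sum_k k_i x_k, \qquad \hat y_i = \sum_{(k,\hat k)} \hat k_i x_{(k,\hat k)}.$$

Denote by $\hat{\mathcal{X}}$ the set of those values of $\{x_{(k,\hat k)}\}$ satisfying condition

$$y_i = \hat y_i = \rho_i, \quad \forall i.$$

Obviously, for any $\{x_{(k,\hat k)}\} \in \hat{\mathcal{X}}$, its $x$-projection is an element of $\mathcal{X}$. Also, note that the condition $y_i = \hat y_i$, $\forall i$,
is equivalent to

$$x_{(k,\hat k)} = 0 \ \text{ unless } \ \hat k = k, \tag{21}$$

and therefore (21) holds for any $\{x_{(k,\hat k)}\} \in \hat{\mathcal{X}}$.
The main result of this section is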
The main result of this section is 
Theorem 12. Consider a sequence of original systems, indexed by $r$, under the Greedy-DM algorithm with
$\alpha > 0$. Let $\{x^r_{(k,\hat k)}(\infty)\}$ denote the random (complete) state of the fluid-scaled process in a stationary regime.
Then, as $r \to \infty$,

$$d(\{x^r_{(k,\hat k)}(\infty)\}, \hat{\mathcal{X}}) \Rightarrow 0, \tag{22}$$
$$x^r(\infty) \Rightarrow x^*, \tag{23}$$

where x* is defined in 

For each $(k,i) \in \mathcal{M}$ and $\hat k \le k$, consider independent unit-rate Poisson processes $\hat\Pi_{(k,\hat k),i}(t)$, $t \ge 0$, and
$\bar\Pi_{(k,\hat k),i}(t)$, $t \ge 0$; and for each $i \in \mathcal{I}$, an independent unit-rate Poisson process $\Pi_i(t)$, $t \ge 0$. Each of these
processes satisfies the functional strong law of large numbers (FSLLN) analogous to (5).

The Markov process $\{X^r_{(k,\hat k)}(\cdot)\}$ for each $r$ is driven by the common set of independent Poisson processes
$\hat\Pi_{(k,\hat k),i}(\cdot)$, $\bar\Pi_{(k,\hat k),i}(\cdot)$ and $\Pi_i(\cdot)$, in the natural way, as follows:

$$\hat D^r_{(k,\hat k),i}(t) = \hat\Pi_{(k,\hat k),i}\Big(\int_0^t \hat k_i \mu_i X^r_{(k,\hat k)}(s)\,ds\Big), \qquad \bar D^r_{(k,\hat k),i}(t) = \bar\Pi_{(k,\hat k),i}\Big(\int_0^t (k_i - \hat k_i)\mu_0 X^r_{(k,\hat k)}(s)\,ds\Big),$$

$$A_i^r(t) = \Pi_i(r\lambda_i t),$$

where $\hat D^r_{(k,\hat k),i}(t)$ and $\bar D^r_{(k,\hat k),i}(t)$ are the numbers of the type-$i$ departures from the configuration $(k, \hat k)$ due to
the service completions of actual customers and tokens, respectively, and $A_i^r(t)$ is the number of the exogenous
type-$i$ arrivals of actual customers; all in the interval $[0, t]$. Clearly, the entire process sample path is (w.p.1)
uniquely determined by the initial state $\{X^r_{(k,\hat k)}(0)\}$, the realizations of the driving Poisson processes and
the Greedy-DM discipline. In particular, the realizations of the following processes are uniquely determined:
the number of type-$i$ departures from configuration $k$ due to actual and token service completions, and their
total,

$$\hat D^r_{ki}(t) = \sum_{\hat k} \hat D^r_{(k,\hat k),i}(t), \qquad \bar D^r_{ki}(t) = \sum_{\hat k} \bar D^r_{(k,\hat k),i}(t), \qquad D^r_{ki}(t) = \hat D^r_{ki}(t) + \bar D^r_{ki}(t);$$

the number of type-$i$ token "arrivals" allocated (upon type-$i$ actual departures) along edge $(k, i)$, $A^r_{ki}(t)$; the
number of type-$i$ actual exogenous arrivals allocated along edge $(k,i)$ without replacing an existing $i$-token,
$A^{*,r}_{ki}(t)$ (such arrivals change the complete configuration from $(k - e_i, \hat k - e_i)$ to $(k, \hat k)$); the number of type-$i$
actual exogenous arrivals that replace an existing token in configuration $k$, $A^{**,r}_{ki}(t)$ (such arrivals change the
complete configuration from $(k, \hat k - e_i)$ to $(k, \hat k)$, so these do not count as arrivals into $k$); the total number
of all type-$i$ arrivals into configuration $k$, $\hat A^r_{ki}(t) = A^r_{ki}(t) + A^{*,r}_{ki}(t)$. The following relations obviously
hold:

$$A_i^r(t) = \sum_{k:\,(k,i)\in\mathcal{M}} \big[A^{*,r}_{ki}(t) + A^{**,r}_{ki}(t)\big], \quad \forall i,$$

$$\sum_{k:\,(k,i)\in\mathcal{M}} A^r_{ki}(t) = \sum_{k:\,(k,i)\in\mathcal{M}} \hat D^r_{ki}(t), \quad \forall i. \tag{24}$$

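We introduce fluid-scaled processes:

$$x^r_k(t) = \frac{1}{r} X^r_k(t), \qquad x^r_{(k,\hat k)}(t) = \frac{1}{r} X^r_{(k,\hat k)}(t), \qquad a^r_i(t) = \frac{1}{r} A^r_i(t),$$

and similarly defined

$$d^r_{ki}(t),\ \hat d^r_{ki}(t),\ \bar d^r_{ki}(t),\ a^r_{ki}(t),\ \hat a^r_{ki}(t),\ a^{*,r}_{ki}(t),\ a^{**,r}_{ki}(t).$$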
A set of Lipschitz continuous functions $[\{x_k(\cdot)\}, \{x_{(k,\hat k)}(\cdot)\}, \{a_i(\cdot)\}, \{a_{ki}(\cdot)\}, \ldots]$ on the time interval $[0, \infty)$
we call a fluid sample path (FSP), if there exist realizations of driving Poisson processes, satisfying the
FSLLN analogous to (5), and a fixed subsequence of $r$, along which

$$[\{x^r_k(\cdot)\}, \{x^r_{(k,\hat k)}(\cdot)\}, \{a^r_i(\cdot)\}, \{a^r_{ki}(\cdot)\}, \ldots] \to [\{x_k(\cdot)\}, \{x_{(k,\hat k)}(\cdot)\}, \{a_i(\cdot)\}, \{a_{ki}(\cdot)\}, \ldots], \quad \text{u.o.c.} \tag{25}$$

Lemma 13. Suppose the initial states $\{x^r_{(k,\hat k)}(0)\}$ are fixed and are such that $\{x^r_{(k,\hat k)}(0)\} \to \{x_{(k,\hat k)}(0)\}$.
Then, w.p.1, for any subsequence of $r$ there exists a further subsequence of $r$ along which the convergence
(25) holds, where the limit is an FSP.



The proof is, again, very standard: it is a more general version of that of Lemma 3. We omit details. □

Lemma 14. Consider an FSP with the initial state $\{x_{(k,\hat k)}(0)\}$. Recall notation:

$$\hat y_i(t) = \sum_{(k,\hat k)} \hat k_i x_{(k,\hat k)}(t), \qquad y_i(t) = \sum_{(k,\hat k)} k_i x_{(k,\hat k)}(t),$$

and denote $\tilde y_i(t) = y_i(t) - \hat y_i(t)$. Then, at any regular point $t$, for any $i$,

$$(d/dt)\hat y_i(t) = \lambda_i - \mu_i \hat y_i(t), \tag{26}$$

$$(d/dt)\tilde y_i(t) = \begin{cases} -\lambda_i + \mu_i \hat y_i(t) - \mu_0 \tilde y_i(t), & \text{if } \tilde y_i(t) > 0, \\ \max\{-\lambda_i + \mu_i \hat y_i(t),\, 0\}, & \text{if } \tilde y_i(t) = 0. \end{cases} \tag{27}$$

In particular, for any $i$, the convergence

$$(\hat y_i(t), \tilde y_i(t)) \to (\rho_i, 0), \quad \forall i, \tag{28}$$

holds and is uniform in initial states $\{x_{(k,\hat k)}(0)\}$ from a compact set, and

$$(\hat y_i(0), \tilde y_i(0)) = (\rho_i, 0) \ \text{ implies } \ (\hat y_i(t), \tilde y_i(t)) = (\rho_i, 0), \quad \forall t. \tag{29}$$

Proof. Equation (26) is very standard, describing an FSP for an M/M/∞ system. Equation (27) for the
$\tilde y_i(t) > 0$ case is also very basic; (27) for the $\tilde y_i(t) = 0$ case is easily verified by considering the behavior of
pre-limit trajectories in a small interval $[t, t + \Delta t]$, and considering the three cases, $-\lambda_i + \mu_i \hat y_i(t) - \mu_0 \tilde y_i(t) < 0$,
$= 0$ and $> 0$, separately. (See e.g. [5] for this type of argument in more detail.) We omit details. □



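Lemma 15. Consider a sequence of original systems, indexed by $r$, under the Greedy-DM algorithm. Let
$\{x^r_{(k,\hat k)}(\infty)\}$ denote the random complete state of the fluid-scaled process in a stationary regime. Then, any
subsequence of $r$ has a further subsequence, such that

$$\{x^r_{(k,\hat k)}(\infty)\} \Rightarrow \{x_{(k,\hat k)}(\infty)\},$$

where $\{x_{(k,\hat k)}(\infty)\} \in \hat{\mathcal{X}}$ w.p.1.

Proof. Fix arbitrary $\delta > 0$, and a sufficiently large compact $B$ so that for all large $r$,

$$\mathbb{P}[\{x^r_{(k,\hat k)}(\infty)\} \in B] \ge 1 - \delta. \tag{30}$$

(We can do that by Lemma 11.) Fix arbitrary $\epsilon > 0$ and choose $T > 0$ large enough so that for any FSP
with $\{x_{(k,\hat k)}(0)\} \in B$, we have $d(\{x_{(k,\hat k)}(T)\}, \hat{\mathcal{X}}) \le \epsilon$. (We can do that by Lemma 14.) Fix arbitrary $\delta_1 > 0$.
We claim that for all sufficiently large $r$,

$$\{x^r_{(k,\hat k)}(0)\} \in B \ \text{ implies } \ \mathbb{P}\{d(\{x^r_{(k,\hat k)}(T)\}, \hat{\mathcal{X}}) > 2\epsilon\} \le \delta_1. \tag{31}$$

This claim is true, because for an arbitrary sequence of fixed initial states $\{x^r_{(k,\hat k)}(0)\} \in B$ we must have

$$\limsup_{r \to \infty} d(\{x^r_{(k,\hat k)}(T)\}, \hat{\mathcal{X}}) \le \epsilon, \quad \text{w.p.1}.$$

(This follows from Lemma 13.) By (30) and (31), for all large $r$, a stationary version of the process is such
that

$$\mathbb{P}\{d(\{x^r_{(k,\hat k)}(T)\}, \hat{\mathcal{X}}) \le 2\epsilon\} \ge (1 - \delta)(1 - \delta_1).$$

Therefore, any limit-in-distribution $\{x_{(k,\hat k)}(T)\}$ (which, by stationarity, has the same distribution as the corresponding limit $\{x_{(k,\hat k)}(\infty)\}$) is such that

$$\mathbb{P}\{d(\{x_{(k,\hat k)}(T)\}, \hat{\mathcal{X}}) \le 2\epsilon\} \ge \limsup_{r \to \infty} \mathbb{P}\{d(\{x^r_{(k,\hat k)}(T)\}, \hat{\mathcal{X}}) \le 2\epsilon\} \ge (1 - \delta)(1 - \delta_1).$$

Since $\delta$, $\delta_1$, $\epsilon$ are arbitrary positive, $\mathbb{P}[\{x_{(k,\hat k)}(T)\} \in \hat{\mathcal{X}}] = 1$. □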

Lemma 16. Consider an FSP with the initial state $\{x_{(k,\hat k)}(0)\} \in \hat{\mathcal{X}}$. (In particular, $x(0) \in \mathcal{X}$.) Then
$\{x_{(k,\hat k)}(t)\} \in \hat{\mathcal{X}}$ for all $t \ge 0$. In addition, at any regular point $t$, using notation $w_{ki}(t) = (d/dt)d_{ki}(t)$,
$\hat w_{ki}(t) = (d/dt)\hat d_{ki}(t)$, $v_{ki}(t) = (d/dt)a_{ki}(t)$, $\hat v_{ki}(t) = (d/dt)\hat a_{ki}(t)$, we have

$$\hat w_{ki}(t) = w_{ki}(t) = k_i \mu_i x_k, \quad \forall (k, i) \in \mathcal{M}, \tag{32}$$

$$\sum_{k} v_{ki}(t) = \sum_{k} w_{ki}(t) = \lambda_i, \quad \forall i \in \mathcal{I}, \tag{33}$$

$$v_{ki}(t) = \hat v_{ki}(t), \quad \forall (k, i) \in \mathcal{M}, \tag{34}$$

$$(d/dt)x_k(t) = \Big[\sum_{i:\,k-e_i \in \bar{\mathcal{K}}} v_{ki} - \sum_{i:\,k+e_i \in \mathcal{K}} v_{k+e_i,i}\Big] - \Big[\sum_{i:\,k-e_i \in \bar{\mathcal{K}}} w_{ki} - \sum_{i:\,k+e_i \in \mathcal{K}} w_{k+e_i,i}\Big], \quad \forall k \in \mathcal{K}, \tag{35}$$

$$\frac{d}{dt} F(x(t)) = D(v(t), x(t)) \le D_{\min}(x(t)). \tag{36}$$

Proof. Since we have (29), $\{x_{(k,\hat k)}(t)\} \in \hat{\mathcal{X}}$ holds by definition. Relation (32) holds because

$$\hat w_{ki}(t) = \sum_{\hat k \le k} \hat k_i \mu_i x_{(k,\hat k)}(t), \qquad w_{ki}(t) - \hat w_{ki}(t) = \sum_{\hat k \le k} (k_i - \hat k_i)\mu_0 x_{(k,\hat k)}(t),$$

and $\{x_{(k,\hat k)}(t)\} \in \hat{\mathcal{X}}$. We obtain (33) from the limit form of (24), namely

$$\sum_{k:\,(k,i)\in\mathcal{M}} a_{ki}(t) = \sum_{k:\,(k,i)\in\mathcal{M}} \hat d_{ki}(t),$$

and from $\sum_k \hat w_{ki}(t) = \sum_k w_{ki}(t) = \mu_i \rho_i = \lambda_i$. Relation (34) follows from the fact that $\hat v_{ki}(t) \ge v_{ki}(t)$,
and the strict inequality cannot hold for any $(k, i)$, because otherwise we would have for at least one $i$

$$(d/dt)y_i(t) = \sum_k \hat v_{ki}(t) - \sum_k w_{ki}(t) > \sum_k v_{ki}(t) - \sum_k w_{ki}(t) = 0.$$

Equation (35) is automatic: the RHS is just the difference between arrival and departure rates to/from
configuration $k$. Finally, since $v(t) = \hat v(t)$, and the rates $v_{ki}(t)$ are those of "arriving" tokens (which
immediately follow service completions of actual customers), the argument in the proof of Lemma 8 (for
the closed system) applies, and we obtain (36). □

As a corollary we obtain the analog of Lemma 9.

Lemma 17. Consider an FSP (for the original system under the Greedy-DM algorithm) with the initial state
$\{x_{(k,\hat k)}(0)\} \in \hat{\mathcal{X}}$. (In particular, $x(0) \in \mathcal{X}$.) Then

$$x(t) \to x^*. \tag{37}$$

The convergence is uniform across all initial states in $\hat{\mathcal{X}}$.
Proof of Theorem 12. Convergence (22) has already been proved in Lemma 15. We fix $\epsilon > 0$ and choose
$T > 0$ large enough so that for any FSP with $\{x_{(k,\hat k)}(0)\} \in \hat{\mathcal{X}}$, we have $\|x(T) - x^*\| \le \epsilon$. We claim that for
any $\delta_1 > 0$ there exists a sufficiently small $\delta_2 > 0$ such that for all sufficiently large $r$,

$$d(\{x^r_{(k,\hat k)}(0)\}, \hat{\mathcal{X}}) \le \delta_2 \ \text{ implies } \ \mathbb{P}\{\|x^r(T) - x^*\| > 2\epsilon\} \le \delta_1. \tag{38}$$

This claim is true, because for an arbitrary sequence of fixed initial states $\{x^r_{(k,\hat k)}(0)\} \to \{x_{(k,\hat k)}(0)\} \in \hat{\mathcal{X}}$ we must have

$$\limsup_{r \to \infty} \|x^r(T) - x^*\| \le \epsilon, \quad \text{w.p.1}.$$

Constants $\epsilon$ and $\delta_1$ can be arbitrarily small; we also know that for any $\delta_2$, $\mathbb{P}\{d(\{x^r_{(k,\hat k)}(\infty)\}, \hat{\mathcal{X}}) \le \delta_2\} \to 1$
as $r \to \infty$. Therefore, claim (38) implies (23). □

7 The case of vector-packing constraints (1): Greedy algorithm
with aggregate configurations

So far in the paper we did not exploit a possible underlying structure of packing constraints. Instead, we
worked with a formally defined set $\mathcal{K}$ of possible configurations. Now we will consider a special case: suppose
the configuration set $\mathcal{K}$ is defined by vector packing constraints (1).

We say that two configurations $k$ and $k'$ are equivalent if they require the same total amounts of resources of
each type:

$$\sum_i k_i b_{i,n} = \sum_i k'_i b_{i,n}, \quad \forall n. \tag{39}$$

A class of equivalent configurations is denoted $q$; we will call it an aggregate configuration, or a-configuration.
The zero a-configuration, denoted (with some abuse of notation) by $q = 0$, is the one containing the sole
configuration $k = 0$; by convention $X_0 = 0$, where the subscript can refer to either the zero configuration or the zero
a-configuration. The sets of all a-configurations and non-zero a-configurations are denoted by $\bar{\mathcal{Q}}$ and $\mathcal{Q}$,
respectively. We write $q(k)$ for the aggregate configuration containing $k$. We use notation

$$X_q = \sum_{k \in q} X_k,$$

and similarly for other quantities summed up over an a-configuration $q$. Clearly, vector $\{X_q,\ q \in \mathcal{Q}\}$ is a
projection of $X = \{X_k,\ k \in \mathcal{K}\}$.

In this section we show that (versions of) the Greedy algorithm, using quantities $X_q$ instead of $X_k$,
asymptotically minimize

$$\sum_q [X_q(\infty)]^{1+\alpha},$$

which, again, approximates (when $\alpha > 0$ is small) the total number of occupied servers $\sum_q X_q(\infty) = \sum_k X_k(\infty)$
in a stationary regime. The difference, however, is that $|\mathcal{Q}|$ can be much smaller than $|\mathcal{K}|$, thus
making the Greedy algorithm easier to implement in practice.
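The equivalence relation (39) can be illustrated with a short sketch that groups configurations into a-configurations by their total resource usage. The function name, the tuple representation of configurations, and the integer-valued requirement matrix `b` are assumptions made for illustration only (integer requirements avoid floating-point comparison issues when testing equality of usage vectors).

```python
from collections import defaultdict

def aggregate_configurations(configs, b):
    """Group configurations k into classes with equal resource usage.

    configs : iterable of tuples k = (k_1, ..., k_I)
    b       : b[i][n] is the amount of resource n required by one type-i customer
    Returns a dict mapping each usage vector to the list of configurations in that class.
    """
    num_resources = len(b[0])
    classes = defaultdict(list)
    for k in configs:
        usage = tuple(
            sum(k[i] * b[i][n] for i in range(len(k)))
            for n in range(num_resources)
        )
        classes[usage].append(k)
    return dict(classes)

# Hypothetical example: type 0 needs (1, 1), type 1 needs (2, 2) units of two resources.
b = [[1, 1], [2, 2]]
configs = [(1, 0), (2, 0), (0, 1), (1, 1)]
classes = aggregate_configurations(configs, b)
```

In this hypothetical example, configurations (2, 0) and (0, 1) both use (2, 2) units and therefore fall into the same a-configuration, so four configurations collapse into three classes.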



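7.1 Results.

For $x \in \mathbb{R}_+^{|\mathcal{K}|}$ consider function

$$\Phi(x) = \sum_q (1 + \alpha)^{-1} x_q^{1+\alpha} = \sum_q (1 + \alpha)^{-1} \Big[\sum_{k \in q} x_k\Big]^{1+\alpha},$$

with parameter $\alpha > 0$, and the following convex optimization problem:

$$\min_{x \in \mathbb{R}_+^{|\mathcal{K}|}} \Phi(x) \tag{40}$$

subject to

$$\sum_k k_i x_k = \rho_i, \quad \forall i. \tag{41}$$

We denote by $\mathcal{X}^*$ the set of optimal solutions of this problem; obviously, $\mathcal{X}^* \subseteq \mathcal{X}$.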
Definition 18 (Greedy discipline with a-configurations).

1. Integral form (Greedy-I-AC). A type $i$ customer arriving at time $t$ is added to an available a-configuration
$q$ (with either $q = 0$ or $X_q(t-) > 0$) such that the addition does not violate the vector packing constraints
and the increment $\Phi(X(t)) - \Phi(X(t-))$ is the smallest. The ties between a-configurations are broken
according to an arbitrary deterministic rule. The choice of a server within the chosen a-configuration
is random uniform.

2. Differential form (Greedy-D-AC). For each $q \in \mathcal{Q}$, denote $w_q(x) = (\partial/\partial x_q)\Phi(x) = x_q^\alpha$, $x \in \mathbb{R}_+^{|\mathcal{K}|}$.
A type $i$ customer arriving at time $t$ is added to an available a-configuration $q$ (with either $q = 0$ or
$X_q(t-) > 0$) such that the addition does not violate the vector packing constraints and the difference
$w_{q+e_i}(X(t-)) - I\{q \ne 0\}\, w_q(X(t-))$ is the smallest. [Here $q + e_i$ denotes the a-configuration containing
configurations $k + e_i$, $k \in q$ (and possibly other configurations).] The ties between a-configurations
are broken according to an arbitrary deterministic rule. The choice of a server within the chosen
a-configuration is random uniform.
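The differential form can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: a-configurations are identified with their resource-usage vectors, the packing constraints are assumed to take the form "usage does not exceed a capacity vector", and the deterministic tie-breaking rule is taken to be "first candidate in sorted order, with the zero a-configuration first". The names `greedy_d_ac_choice`, `Xq`, `b_i` and `capacity` are hypothetical.

```python
def greedy_d_ac_choice(Xq, b_i, capacity, alpha):
    """Choose the a-configuration q that receives an arriving type-i customer.

    Xq       : dict mapping usage vectors (a-configurations) to numbers of servers X_q
    b_i      : resource requirement vector of one type-i customer
    capacity : per-server capacity vector (assumed form of the packing constraints)
    alpha    : parameter alpha > 0
    Returns the usage vector of the chosen a-configuration (before the addition),
    or None if the customer fits nowhere.
    """
    zero = (0,) * len(capacity)
    occupied = sorted(q for q, cnt in Xq.items() if cnt > 0 and q != zero)
    best, best_diff = None, None
    for q in [zero] + occupied:
        q_plus = tuple(u + d for u, d in zip(q, b_i))
        if any(u > c for u, c in zip(q_plus, capacity)):
            continue  # the addition would violate the packing constraints
        # w_{q+e_i}(X) - I{q != 0} * w_q(X), with w_q(x) = x_q ** alpha
        diff = Xq.get(q_plus, 0) ** alpha - (Xq.get(q, 0) ** alpha if q != zero else 0.0)
        if best is None or diff < best_diff:
            best, best_diff = q, diff
    return best
```

Only the aggregate counts $X_q$ enter the decision, so the loop runs over at most $|\mathcal{Q}| + 1$ candidates rather than over all exact configurations.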

Theorem 19. Consider a sequence of closed systems, indexed by $r$, and let $x^r(\infty)$ denote the random state
of the (fluid-scaled) process in a stationary regime, under the Greedy-D-AC discipline with $\alpha > 0$. Then, as
$r \to \infty$,

$$d(x^r(\infty), \mathcal{X}^*) \Rightarrow 0.$$

Definition 20 (Greedy-DM-AC discipline). This discipline, which uses tokens, is the modification of
Greedy-D-AC, completely analogous to the modification of Greedy-D that leads to Greedy-DM.

Theorem 21. Consider a sequence of original systems, indexed by $r$, under the Greedy-DM-AC algorithm
with $\alpha > 0$. Let $\{x^r_{(k,\hat k)}(\infty)\}$ denote the random (complete) state of the fluid-scaled process in a stationary
regime. Then, as $r \to \infty$,

$$d(\{x^r_{(k,\hat k)}(\infty)\}, \hat{\mathcal{X}}) \Rightarrow 0,$$
$$d(x^r(\infty), \mathcal{X}^*) \Rightarrow 0.$$

In the rest of this section we will consider the closed system under Greedy-D-AC, and will prove Theorem 19.
We will omit the proof of Theorem 21, which is "obtained from" that of Theorem 19 in exactly the same way as
the proof of Theorem 12 was obtained from that of Theorem 2.

7.2 Optimal set characterization and related properties.

Using Lagrange multipliers $\eta_i$ for the constraints (41), the Lagrangian of the problem (40)-(41) is

$$\sum_q (1+\alpha)^{-1} \Big[\sum_{k \in q} x_k\Big]^{1+\alpha} + \sum_i \eta_i \Big[\rho_i - \sum_k k_i x_k\Big].$$

We obtain the following characterization: vector $x \in \mathcal{X}$ is an optimal solution of (40)-(41) (i.e. $x \in \mathcal{X}^*$) if
and only if there exist constants $\eta_i$ such that (using notation $u_k = \max\{\sum_i k_i \eta_i, 0\}$)

$$x_q^\alpha = \max_{k \in q} u_k, \quad \forall q \in \mathcal{Q}, \tag{42}$$

$$u_k < \max_{k' \in q(k)} u_{k'} \ \text{ implies } \ x_k = 0, \quad \forall k \in \mathcal{K}. \tag{43}$$

More notation. Consider the following order relation on $\mathcal{Q}$: $q' \le q$ if $k' \le k$ for some $k' \in q'$ and $k \in q$; $q' < q$
means $q' \le q$ and $q' \ne q$. If a-configuration $q$ contains at least one $k$ with $k_i > 0$, we use a (slightly abusive)
notation $q - e_i$ for the a-configuration containing $k - e_i$, i.e. $q - e_i = \{k - e_i \mid k \in q,\ k_i > 0\}$; otherwise,
$q - e_i = \emptyset$. Denote by $\mathcal{M}^a$ the set of pairs $(q, i)$ such that $q - e_i \ne \emptyset$. For $x \in \mathbb{R}_+^{|\mathcal{K}|}$ and $(q, i) \in \mathcal{M}^a$ denote

$$\Delta_{qi} = \Delta_{qi}(x) = x_q^\alpha - x_{q - e_i}^\alpha.$$

Lemma 22. Consider the following property of an element $x \in \mathcal{X}$. (We will refer to it as the NSI-property -
"No Simple Improving allocation".) For any two elements $(q, i), (q', i) \in \mathcal{M}^a$ (with $q \ne q'$, but a common $i$),
condition

$$\Delta_{q'i} < \Delta_{qi} \tag{44}$$

implies either

$$x_k = 0 \ \text{ for all } k \in q \text{ such that } k_i > 0 \tag{45}$$

or

$$q' - e_i \ne 0 \ \text{ and } \ x_{q' - e_i} = 0. \tag{46}$$

If $x \in \mathcal{X}$ satisfies the NSI-property, then condition (42) holds.

Proof. Consider $x \in \mathcal{X}$ satisfying the NSI-property. For each $i$ define

$$\underline{\eta}_i = \min_{k:\ k_i > 0,\ x_k > 0} \Delta_{q(k),i}, \qquad \overline{\eta}_i = \max_{k:\ k_i > 0,\ x_k > 0} \Delta_{q(k),i}.$$

It is easy to check that we cannot have $\overline{\eta}_i > 0$ and $\underline{\eta}_i < 0$, because this would violate the NSI-property.
Then, we can further define

$$\eta_i = \begin{cases} \overline{\eta}_i, & \text{if } \underline{\eta}_i \ge 0, \\ \underline{\eta}_i, & \text{if } \overline{\eta}_i \le 0. \end{cases} \tag{47}$$

Denote by $\mathcal{I}^+$ the subset of those $i$ with $\eta_i > 0$, and by $\mathcal{I}^- = \mathcal{I} \setminus \mathcal{I}^+$ the remaining subset. It is easy to
check that for any fixed $i \in \mathcal{I}^-$, we must have

$$\Delta_{q(k),i} = \eta_i \ \text{ for all } k \text{ such that } k_i > 0,\ x_k > 0, \tag{48}$$

otherwise a contradiction to the NSI-property is obtained.

Using notation $u_k = \max\{\sum_i k_i \eta_i, 0\}$, let us define the values $x_q^0$ via

$$[x_q^0]^\alpha = \max_{k \in q} u_k.$$

It is easy to check that

$$[x_q^0]^\alpha - [x_{q - e_i}^0]^\alpha \ge \eta_i, \quad \forall (q, i) \in \mathcal{M}^a. \tag{49}$$

Let us prove (42), which is equivalent to $x_q = x_q^0$, $\forall q$. Suppose this is not true. Consider a minimal $q$ for
which $x_q \ne x_q^0$.

Case (c1): suppose $x_q > x_q^0$. Then necessarily $x_q > 0$. Consider any $k \in q$ with $x_k > 0$.
Sub-case (c1.1): suppose $k_i > 0$ for some $i \in \mathcal{I}^-$. Fix this $i$ and denote $q' = q - e_i$. If $q' = 0$, we obtain a
contradiction to the definition of $\eta_i$, and so $q' \ne 0$ must hold. Then $x_{q'} = x_{q'}^0$, by definition of $q$ (as a minimal
counterexample). Using (49) we obtain $x_q^\alpha - x_{q'}^\alpha = \Delta_{qi} > \eta_i$ - a contradiction to (48). Thus sub-case (c1.1)
is impossible.

Sub-case (c1.2): suppose $k_i > 0$ implies $i \in \mathcal{I}^+$ and then $\eta_i > 0$. Fix an $i$ with $k_i > 0$, and consider
$q' = q - e_i$. If $q' = 0$, we obtain a contradiction to the definition of $\eta_i$, and so $q' \ne 0$ must hold. We have
$x_{q'} = x_{q'}^0$, by definition of $q$; therefore, using (49), $x_q^\alpha - x_{q'}^\alpha = \Delta_{qi} > \eta_i$ - a contradiction with the definition
of $\eta_i$. Sub-case (c1.2), and then case (c1), is impossible.

Case (c2): suppose $x_q < x_q^0$. Then necessarily $x_q^0 > 0$. Let us fix a $k \in q$ on which the $\max_{\ell \in q} u_\ell > 0$ is
attained, and so $u_k > 0$.

Sub-case (c2.1): suppose $k_i > 0$ for some $i \in \mathcal{I}^-$. Fix this $i$ and denote $q' = q - e_i$. We cannot have $q' = 0$,
because that would imply $\eta_i > 0$. Therefore, $q' \ne 0$ and $[x_q^0]^\alpha - [x_{q'}^0]^\alpha = \eta_i$, implying in particular $x_{q'}^0 > 0$.
But $x_{q'} = x_{q'}^0$, and therefore $x_q^\alpha - x_{q'}^\alpha = \Delta_{qi} < \eta_i$. Recalling the definition of $\eta_i$, we obtain a contradiction
to the NSI-property. Thus, sub-case (c2.1) is impossible.

Sub-case (c2.2): suppose $k_i > 0$ implies $i \in \mathcal{I}^+$ and then $\eta_i > 0$. Fix an $i$ with $k_i > 0$, and consider $q' = q - e_i$.
If $q' = 0$, we have $\Delta_{qi} < \eta_i$ - a contradiction to the NSI-property. Therefore, $q' \ne 0$ and $x_{q'} = x_{q'}^0 > 0$ (because
$\eta_j > 0$ for all $j$ with $k_j > 0$). Then, $\Delta_{qi} < \eta_i$ and we, again, obtain a contradiction to the NSI-property. Sub-case
(c2.2), and then case (c2), is impossible.
The proof of (42) is complete. □

7.3 Fluid sample paths.

We now define fluid sample paths for the closed system under the Greedy-D-AC algorithm. First, we will specify
the construction of the process itself. In addition to the set of unit-rate Poisson processes driving the service
completions, we define primitive processes (common for each $r$) driving the random uniform assignment of
customers "within" each a-configuration $q$. Namely, for each $q$ we define an i.i.d. sequence $\xi_q(1), \xi_q(2), \ldots$
of random variables, uniformly distributed in $[0, 1]$. The configurations $k \in q$ are indexed by $1, 2, \ldots, |q|$ (in
arbitrary fixed order). When an $m$-th customer of any type is assigned to a-configuration $q$ (with $m$ referring
to the order of assignment since initial time 0, and not to the customer type), this customer is assigned to
a server in configuration $k'$ indexed by 1 if

$$\xi_q(m) \in [0,\ X^r_{k'}/X^r_q],$$

it is assigned to a server in configuration $k''$ indexed by 2 if

$$\xi_q(m) \in (X^r_{k'}/X^r_q,\ (X^r_{k'} + X^r_{k''})/X^r_q],$$

and so on. (Note that necessarily $X^r_q > 0$ - otherwise there would be no assignment to a-configuration $q$.)
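The interval rule above amounts to inverse-CDF sampling over the configurations within $q$, weighted by their server counts. A minimal sketch follows; the function name and the list-of-pairs representation are assumptions for illustration, and configurations with zero servers are skipped so that the boundary value $\xi = 0$ never selects an empty class.

```python
def pick_configuration(xi, counts):
    """Map a uniform variate xi in [0, 1] to a configuration within an a-configuration.

    counts : list of (configuration, X_k) pairs in the fixed index order 1, 2, ..., |q|
    The j-th configuration is chosen if xi falls into the j-th interval
    (sum_{l<j} X_l / X_q, sum_{l<=j} X_l / X_q], the first interval being closed at 0.
    """
    total = sum(c for _, c in counts)
    if total <= 0:
        raise ValueError("no servers in this a-configuration")
    cumulative = 0
    last_nonempty = None
    for k, c in counts:
        if c <= 0:
            continue
        cumulative += c
        last_nonempty = k
        if xi * total <= cumulative:
            return k
    return last_nonempty  # guards against rounding when xi is (numerically) 1

# Hypothetical example: one server in configuration (2, 0), three in (0, 1).
counts = [((2, 0), 1), ((0, 1), 3)]
first = pick_configuration(0.25, counts)
second = pick_configuration(0.5, counts)
```

Because each configuration is selected with probability $X^r_k / X^r_q$, this is the same as choosing a server uniformly at random among the servers of the a-configuration.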
Denote

$$g^r_q(s, \zeta) = \frac{1}{r} \sum_{m=1}^{\lfloor rs \rfloor} I\{\xi_q(m) \le \zeta\},$$

where $s \ge 0$, $0 \le \zeta \le 1$, and $\lfloor \cdot \rfloor$ denotes the integer part of a number. Obviously, from the strong law of
large numbers (SLLN) and the monotonicity of $g^r_q(s, \zeta)$ in both arguments, we have the following functional
SLLN:

$$g^r_q(s, \zeta) \to s\zeta, \quad \text{u.o.c., w.p.1}. \tag{50}$$

Clearly, the realization of the process is uniquely determined by the initial state and the realizations of
driving processes $\Pi_{ki}(\cdot)$ and $(\xi_q(1), \xi_q(2), \ldots)$.

A set of Lipschitz continuous functions $[\{x_k(\cdot),\ k \in \mathcal{K}\}, \{d_{ki}(\cdot),\ (k,i) \in \mathcal{M}\}, \{a_{ki}(\cdot),\ (k,i) \in \mathcal{M}\}]$ on the
time interval $[0, \infty)$ we call a fluid sample path (FSP), if there exist realizations of $\Pi_{ki}(\cdot)$ satisfying (8),
realizations of $(\xi_q(1), \xi_q(2), \ldots)$ satisfying (50), and a fixed subsequence of $r$, along which convergence (10)
holds.

It is easy to see that the family of all FSPs is uniformly Lipschitz.

We can easily verify that Lemmas [3] and 2] hold as is for the Greedy-D-AC algorithm. Further, the following
lemma is analogous to Lemma 8 (and has essentially the same proof).

Lemma 23. Any FSP is such that at any regular point $t$,

$$(d/dt)\Phi(x(t)) \le 0. \tag{51}$$

Moreover, unless $x(t)$ satisfies the NSI-condition, the inequality (51) is strict.

We will need the following FSP property, which follows from the random uniform rule of Greedy-D-AC for
assignments within each a-configuration.

Lemma 24. Consider an FSP. Suppose that at some $t \ge 0$, $x = x(t)$ is such that for some $k \in \mathcal{K}$ and $i$ we
have: $x_k > 0$, $k_i > 0$, $k' = k - e_i \ne 0$, $x_{k'} = 0$, and $x_{q'} > 0$ where $q' = q(k')$. Then $t$ is not a regular point.

Proof. Suppose $t$ is a regular point. We must have $\sum_{i'} v_{k'+e_{i'},i'}(t) = 0$. Indeed, in a small interval $[t, t + \delta]$,
for all sufficiently large $r$, the pre-limit sample paths defining the FSP are such that

$$\frac{x^r_{k'}}{x^r_{q'}} \le \frac{C\delta}{x_{q'} - C\delta},$$

where $C > 0$ is some constant (depending on the Lipschitz constants for FSP components). This, along with
(50), implies that the fraction of the customers added in $[t, t + \delta]$ to servers in configuration $k'$, among those
added to a-configuration $q'$, is upper bounded by the RHS, which can be made arbitrarily small by choosing
sufficiently small $\delta$. This means that $\sum_{i'} (a_{k'+e_{i'},i'}(t + \delta) - a_{k'+e_{i'},i'}(t))/\delta \to 0$ as $\delta \to 0$, which implies
$\sum_{i'} v_{k'+e_{i'},i'}(t) = 0$. Since $x_{k'} = 0$, obviously, $\sum_{i'} w_{k',i'}(t) = 0$. However, $w_{k,i}(t) = \mu_i k_i x_k > 0$. Therefore,
$(d/dt)x_{k'}(t) > 0$. This is a contradiction, because if $t$ is regular, $x_{k'}(t) = 0$ implies $(d/dt)x_{k'}(t) = 0$. □

We will also need the continuity and shift properties of the FSPs, which are quite generic. (See Sections
5 and 6 in [7]. Although our model is different, essentially the same proofs as in [7] apply.) The time shift
by $\theta \ge 0$, applied to an FSP $[\{x_k(\cdot)\}, \{d_{ki}(\cdot)\}, \{a_{ki}(\cdot)\}]$, produces the set of functions with the same time
argument $t \ge 0$, but with $x_k(t)$ replaced by $x_k(\theta + t)$, $d_{ki}(t)$ replaced by $d_{ki}(\theta + t) - d_{ki}(\theta)$, and $a_{ki}(t)$ replaced
by $a_{ki}(\theta + t) - a_{ki}(\theta)$.

Lemma 25. The family of FSPs satisfies the following properties.

(i) Continuity: If there is a converging sequence of FSPs, indexed by $\beta$, namely

$$[\{x^\beta_k(\cdot)\}, \{d^\beta_{ki}(\cdot)\}, \{a^\beta_{ki}(\cdot)\}] \to [\{x_k(\cdot)\}, \{d_{ki}(\cdot)\}, \{a_{ki}(\cdot)\}], \quad \text{u.o.c.},$$

then the limit is also an FSP.

(ii) Shift (or "Memoryless"): The time shift of any FSP by any $\theta \ge 0$ is also an FSP.

Proof. (i) For each fixed index $\beta$, and the FSP associated with it, consider a sequence of (scaled) sample
paths of the process that define this FSP:

$$[\{x^{(\beta,r)}_k(\cdot)\}, \{d^{(\beta,r)}_{ki}(\cdot)\}, \{a^{(\beta,r)}_{ki}(\cdot)\}] \to [\{x^\beta_k(\cdot)\}, \{d^\beta_{ki}(\cdot)\}, \{a^\beta_{ki}(\cdot)\}], \quad \text{u.o.c.}$$

Then, we can choose a subsequence of $r$, and the corresponding $\beta = \beta(r)$, so that

$$[\{x^{(\beta(r),r)}_k(\cdot)\}, \{d^{(\beta(r),r)}_{ki}(\cdot)\}, \{a^{(\beta(r),r)}_{ki}(\cdot)\}] \to [\{x_k(\cdot)\}, \{d_{ki}(\cdot)\}, \{a_{ki}(\cdot)\}], \quad \text{u.o.c.},$$

and therefore the limit satisfies the definition of an FSP.

(ii) We pick a sequence of (scaled) sample paths of the process that define the FSP. It is easy to see that
the time shifts of these sample paths define the FSP which is the time shift of the original one. □

We are now in position to prove the following lemma, which is key (along with Lemmas 22 and 24) in our
analysis of the Greedy-D-AC algorithm.

Lemma 26. Consider an FSP. Suppose $t$ is a regular point and $x(t) \notin \mathcal{X}^*$. Then

$$(d/dt)\Phi(x(t)) < 0. \tag{52}$$

Proof. Suppose not, namely $(d/dt)\Phi(x(t)) = 0$. Then for $x = x(t)$ the NSI-condition holds (by Lemma 23), and therefore
(42) holds as well (by Lemma 22). We will obtain a contradiction. Condition (43) does not hold (otherwise, (42) and (43)
would imply $x \in \mathcal{X}^*$). Then, consider a minimal a-configuration $q$ for which (43) is violated, namely: for
some $k \in q$,

$$u_k < \max_{k' \in q} u_{k'}, \qquad x_k > 0. \tag{53}$$

We will show that this is impossible. First, obviously, $x_q > 0$.

Case (c3.1): suppose $k_i > 0$ for some $i \in \mathcal{I}^-$. Fix this $i$ and consider $q' = q - e_i$. If $q' = 0$, we have $\Delta_{qi} > 0$,
which means $\eta_i$ cannot be negative - a contradiction. Therefore, $q' \ne 0$ and we must have $x_{q'} > 0$ (because
otherwise $\Delta_{qi} > 0$ leads, again, to the contradiction with $\eta_i \le 0$). Consider the set $p = \{k' + e_i \mid k' \in q'\} \subseteq q$;
i.e., these are the configurations in $q$ that are obtained by adding one type $i$ customer to configurations in
$q'$ - it may or may not be a strict subset of $q$. If $\max_{k' \in p} u_{k'} < \max_{k' \in q} u_{k'}$, then, since $\max_{k' \in q'} u_{k'} + \eta_i \le
\max_{k' \in p} u_{k'}$, we obtain $\Delta_{qi} > \eta_i$, which, along with $x_k > 0$, leads to the contradiction with the NSI-property.
Therefore, $\max_{k' \in p} u_{k'} = \max_{k' \in q} u_{k'}$. Then, there exists $k'' \in \arg\max_{k' \in q'} u_{k'}$ such that $x_{k''} > 0$ and
$k'' + e_i \in \arg\max_{k' \in q} u_{k'}$. (Here we used the fact that for any $k' \in q'$, condition (43) does hold - recall that
$q$ is a minimal a-configuration for which (43) is violated.) Note also that $u_{k - e_i} < \max_{k' \in q'} u_{k'} = u_{k''}$ and
$x_{k - e_i} = 0$ (otherwise, again, (43) would be violated at $q' < q$). We see that $x$ and the edge $(k, i)$ satisfy the
conditions of Lemma 24. This means $t$ cannot be regular. Thus, case (c3.1) is impossible.
Case (c3.2): suppose for any $i$ with $k_i > 0$ we have $i \in \mathcal{I}^+$. Fix one such $i$ and consider $q' = q - e_i$. If $q' = 0$,
we have $\Delta_{qi} > \eta_i > 0$ - a contradiction with the definition of $\eta_i$. Therefore, $q' \ne 0$ and then we must have
$x_{q'} > 0$ (because $\eta_j > 0$ for each $j$ with $k_j > 0$). Consider the set $p = \{k' + e_i \mid k' \in q'\} \subseteq q$. Note that
$\max_{k' \in q'} u_{k'} > 0$ and $\max_{k' \in q'} u_{k'} + \eta_i = \max_{k' \in p} u_{k'} > 0$. If $\max_{k' \in p} u_{k'} < \max_{k' \in q} u_{k'}$, then $\Delta_{qi} > \eta_i > 0$,
which (along with $x_k > 0$) contradicts the definition of $\eta_i$. Therefore, $\max_{k' \in p} u_{k'} = \max_{k' \in q} u_{k'}$. From this
point on, the argument leading to a contradiction repeats that in the case (c3.1) verbatim. Thus, the case
(c3.2) is impossible. The proof is complete. □

Lemma 27. For any $T > 0$ and $\epsilon > 0$, there exists $\delta > 0$ such that the following property holds uniformly
on all FSPs and all $t_0 \ge 0$:

$$d(x(t), \mathcal{X}^*) \ge \epsilon,\ t \in [t_0, t_0 + T] \ \text{ implies } \ \Phi(x(t_0 + T)) - \Phi(x(t_0)) \le -\delta. \tag{54}$$

Proof. If (54) did not hold, we would be able to construct a sequence of FSPs converging u.o.c. to an
FSP such that

$$d(x(t), \mathcal{X}^*) \ge \epsilon \ \text{ and } \ \Phi(x(t)) = \Phi(x(0)), \quad t \in [0, T].$$

(Here we use the shift, continuity and uniform Lipschitz properties of the family of FSPs, and the fact that
$\mathcal{X}$ is compact.) This is not possible, because by Lemma 26 we must have $(d/dt)\Phi(x(t)) < 0$ at every regular
point in $[0, T]$. □

As a corollary, we obtain the following analog of Lemma 9.
Lemma 28. Any FSP is such that

$$d(x(t), \mathcal{X}^*) \to 0. \tag{55}$$

The convergence is uniform across all initial states $x(0) \in \mathcal{X}$.

7.4 Proof of Theorem 19.

The rest of the proof of Theorem 19 is the same as that of Theorem 2.



8 Some generalizations 

A number of generalizations of our results are not difficult to obtain. We will discuss Theorems [2] and [12] to 
be specific, but analogous generalizations apply to Theorems \T§\ and [2T1 
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8.1 A different procedure for placing arrivals. 

Theorems 2 and 12 require that when a type $i$ customer (in Theorem 2) or a type $i$ token (in Theorem 12) is
assigned for service, it is placed along the edge $(k, i)$ minimizing the weight differential $\Delta_{ki} = \Delta_{ki}(X(t-))$.
The procedure of choosing the edge to place a customer (or token) can be replaced by the following one,
which might be easier to implement in some scenarios. We compare $\Delta_{k'i}$ for the edge $(k', i)$ along which
a type $i$ departure just occurred, to the $\Delta_{ki}$ for one edge, selected randomly as follows: with probability
$\epsilon \in (0, 1)$ we select edge $(e_i, i)$; with probability $1 - \epsilon$ we pick a non-empty server uniformly at random and,
if its configuration $\ell$ is such that $k = \ell + e_i \in \mathcal{K}$, we select edge $(k, i)$. ($\epsilon$ is a fixed parameter.) Now, if
$\Delta_{ki} < \Delta_{k'i}$ for the selected edge $(k, i)$ (if any), we place the customer (or token) along $(k, i)$; otherwise, we
place it "back" along $(k', i)$. It is not difficult to see that the proofs of Theorems 2 and 12 still hold when the
Greedy-D and Greedy-DM algorithms, respectively, are adjusted as described above.

The described alternative procedure generalizes the results in the sense that we can, for example, use this
procedure with a fixed probability $\delta \in [0, 1]$ and use the "old" procedure (picking the smallest differential
$\Delta_{ki}$) with probability $1 - \delta$, and the results still hold.
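
The alternative placement procedure of this subsection can be sketched in Python as follows. This is only an
illustrative sketch under stated assumptions, not the authors' implementation: the function names
`pick_candidate` and `place_after_departure`, the representation of configurations as tuples, the list
`server_configs`, and the caller-supplied function `delta(k, i)` (standing in for the weight differential
evaluated at the current state) are all hypothetical. The sketch also assumes that every single-customer
configuration is allowed.

```python
import random


def pick_candidate(i, n_types, server_configs, allowed, eps, rng):
    """Randomly select one candidate target configuration k for a type-i
    placement, or return None if the sampled server cannot accept it.

    server_configs: list with one configuration tuple per non-empty server
                    (hypothetical representation of the system state).
    allowed:        set of allowed configuration tuples.
    """
    e_i = tuple(1 if j == i else 0 for j in range(n_types))
    if rng.random() < eps:
        # With probability eps: the edge that opens an empty server.
        # Assumption: the single-customer configuration e_i is allowed.
        return e_i
    if not server_configs:
        return None
    # With probability 1 - eps: sample a non-empty server uniformly at random.
    ell = rng.choice(server_configs)
    k = tuple(a + b for a, b in zip(ell, e_i))
    return k if k in allowed else None


def place_after_departure(i, k_prime, server_configs, allowed, eps, delta,
                          rng=random):
    """Return the target configuration for the type-i customer (or token).

    k_prime: configuration of the edge along which the departure occurred.
    delta:   hypothetical callable (k, i) -> weight differential.
    """
    k = pick_candidate(i, len(k_prime), server_configs, allowed, eps, rng)
    if k is not None and delta(k, i) < delta(k_prime, i):
        return k        # the sampled edge is strictly better
    return k_prime      # otherwise place "back" along the departure edge
```

Only two weight differentials are evaluated per placement, which is the source of the implementation
simplicity mentioned above; the full minimization over all edges is not needed.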

8.2 More general input processes and service time distributions. 

Theorems 2 and 12 still hold for much more general input processes and service time distributions (as opposed
to Poisson and exponential, respectively). For example, a simple (but still far-reaching) generalization is for
the case when, for each customer type $i$, the input process is renewal (i.i.d. interarrival times, with mean
$1/(\lambda_i r)$ and finite variance) and the service time distribution $G_i(\xi)$ (with mean $\int \xi \, dG_i(\xi) = 1/\mu_i$) has the
"hazard rate" lower bounded by $\mu_i^{min} \in (0, \mu_i]$: $dG_i(\xi)/[1 - G_i(\xi)] \ge \mu_i^{min} d\xi$, $\forall \xi \ge 0$. In this case, we observe
that the key conservation laws still hold for the fluid limit of the stationary system: (a) the "amount" of
type $i$ fluid is $\rho_i$ and remains constant and (b) the total rate of (actual) type $i$ departures is $\lambda_i$ and remains
constant. In addition, say in the proof of Theorem 12 to be specific, the corresponding FSPs are such that the
(actual) type $i$ departure rate from state $(k, \hat{k})$ is lower bounded by $\mu_i^{min} k_i x_{(k,\hat{k})}(t)$. (Of course, the FSPs
need to be defined more generally, to account for elapsed service times.) Given these properties, the entire
argument goes through essentially as is. And, clearly, these properties hold under input flow and service
time assumptions still far more general than in the simple case described above.
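
As a concrete illustration of the hazard-rate condition, consider a hyperexponential service time (a mixture
of exponentials). Its hazard rate is a weighted average of the component rates, so it is bounded below by the
smallest rate, and that smallest rate does not exceed the reciprocal of the mean. The Python sketch below
checks this numerically. The mixture parameters and the function names `hyperexp_hazard` and `hyperexp_mean`
are hypothetical choices made for the example and do not come from the paper.

```python
import math


def hyperexp_hazard(xi, probs, rates):
    """Hazard rate g(xi) / (1 - G(xi)) of a mixture of exponentials."""
    density = sum(p * m * math.exp(-m * xi) for p, m in zip(probs, rates))
    survival = sum(p * math.exp(-m * xi) for p, m in zip(probs, rates))
    return density / survival


def hyperexp_mean(probs, rates):
    """Mean service time of the mixture."""
    return sum(p / m for p, m in zip(probs, rates))


# Hypothetical example: two exponential phases with rates 0.5 and 2.0.
probs, rates = (0.3, 0.7), (0.5, 2.0)
mu = 1.0 / hyperexp_mean(probs, rates)   # reciprocal of the mean service time
mu_min = min(rates)                      # candidate lower bound on the hazard rate

# Evaluate the hazard rate on a grid of elapsed service times.
grid = [0.1 * n for n in range(500)]
hazard_ok = all(hyperexp_hazard(xi, probs, rates) >= mu_min - 1e-12 for xi in grid)
```

Here `hazard_ok` confirms the lower bound on the grid, and `mu_min <= mu` holds as the condition requires.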



9 Discussion 

We have shown that (versions of) the Greedy algorithm are asymptotically optimal in the sense of minimizing
the objective function $\sum_k X_k^{1+\alpha}$ with $\alpha > 0$. When $\alpha$ is small (but positive), the algorithms produce an
approximation of a solution minimizing the linear objective $\sum_k X_k$, i.e. the total number of occupied servers.
If $\sum_k X_k$ is the "real" underlying objective, the "price" we pay by applying the Greedy algorithm with small
$\alpha > 0$ is that the algorithm will keep non-zero amounts ("safety stocks") of servers in many "unnecessary"
(from the point of view of the linear objective) configurations $k$, including many (potentially all) non-maximal
configurations in $\mathcal{K}$. What we gain for this "price" is the simplicity and agility of the algorithm. "True"
minimization of the linear objective $\sum_k X_k$ requires that a linear program be solved (via an explicit offline or
implicit dynamic approach), so that the system is prevented from using "unnecessary" configurations $k$, not
employed in optimal LP solutions.
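
The relation between the two objectives can be checked numerically with a short Python sketch. The server
counts below are made-up numbers used only for illustration, and the function names `objective` and
`occupied_servers` are hypothetical.

```python
def objective(X, alpha):
    """Sum over configurations k of X_k^(1 + alpha).

    X: dict mapping a configuration (tuple) to the number of servers in it.
    """
    return sum(x ** (1.0 + alpha) for x in X.values() if x > 0)


def occupied_servers(X):
    """Linear objective: total number of occupied servers."""
    return sum(X.values())


# Hypothetical state: server counts per configuration.
X = {(1, 0): 40, (1, 1): 100, (2, 0): 10}

total = occupied_servers(X)
ratio_small = objective(X, 0.01) / total   # close to 1 for small alpha
ratio_tiny = objective(X, 1e-6) / total    # closer still as alpha shrinks
```

For a fixed state the ratio approaches 1 as the exponent parameter decreases to zero, which is the sense in
which the convex objective approximates the server count.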

The Greedy algorithm with $\alpha > 0$ is asymptotically optimal as the average number $r$ of customers in the
system goes to infinity. The fact that it maintains safety stocks of many configurations means, in particular,
that the algorithm's performance is close to optimal when the ratio $r/|\mathcal{K}|$ is sufficiently large, so that there are
enough customers in the system to keep non-negligible safety stocks of servers in potentially all configurations.
If the number $|\mathcal{K}|$ of configurations is large, then $r$ needs to be very large to achieve near-optimality. The use
of aggregate configurations in the special case of vector-packing constraints alleviates this scalability issue
when the number $|\mathcal{Q}|$ of aggregate configurations is substantially smaller than $|\mathcal{K}|$.

Finally, we note that the closed system, considered in Theorems 5 and 19, is not necessarily artificial. For
example, it models the scenario where VMs do not leave the system, but can be moved ("migrated") from 
one host to another. In this case, a "service completion" is a time point when a VM migration can be 
attempted. 